Re: [GRASS-user] improve v.rast.stats speed?
Hamish wrote:
> Jose Gómez-Dans wrote:
>> My take on this is to rasterize my vector data with gdal_rasterize (you can have a look at the rasterisation code and see how it works, in case you need to, e.g., buffer your vector data), load it up in Python, load my dataset in Python, and calculate whatever stats with scipy+numpy. If you look at this thread, you'll find it is very fast: http://article.gmane.org/gmane.comp.python.scientific.user/19412.
>
> numpy is already required by the new wxGUI*, so with numpy around anyway, maybe some Python module could be written for grass7, where Python is a full dependency?
>
> * see gui/wxpython/gui_modules/profile.py

I also think that GRASS should provide a reasonably fast way to get this kind of stats. You can still devise your own solution if you want, but IMHO GRASS must be able to do this job reasonably fast and in a user-friendly way.

At the risk of becoming annoying: with r.univar.zonal, everything could be done in one pass: rasterize the vector, no need for mapcalc, run r.univar.zonal once (which itself needs only one pass), load the stats into the attribute table, done. With the example that started this thread, everything should be completed in very few minutes. Rasterizing the vector might take the longest.

Anyway, when it comes to processing time, I'm a speed junkie, and 5 hours is simply unacceptable if it can also be done in minutes or even seconds. GRASS should do that, rather than forcing users to come up with their own workarounds for something that GRASS is supposed to do.

Markus M

___
grass-user mailing list
grass-user@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/grass-user
Re: [GRASS-user] improve v.rast.stats speed?
Dylan:
> OK. This is the old stable branch (I think). If you can get 2.0 to compile I would suggest trying that.

Dylan, which one is 2.0 for Linux? I can't trace it.

Thanks,
Nikos
Re: [GRASS-user] improve v.rast.stats speed?
On Sat, Feb 21, 2009 at 11:19 PM, Nikos Alexandris nikos.alexand...@felis.uni-freiburg.de wrote:
> Dylan:
>> OK. This is the old stable branch (I think). If you can get 2.0 to compile I would suggest trying that.
>
> Dylan, which one is 2.0 for Linux? I can't trace it.

He meant Starspan, but I don't see a 2.x version:
http://starspan.casil.ucdavis.edu/doku/doku.php?id=download
?

Markus
Re: [GRASS-user] improve v.rast.stats speed?
On Sat, 2009-02-21 at 23:53 +0100, Markus Neteler wrote:
> On Sat, Feb 21, 2009 at 11:19 PM, Nikos Alexandris nikos.alexand...@felis.uni-freiburg.de wrote:
>> Dylan:
>>> OK. This is the old stable branch (I think). If you can get 2.0 to compile I would suggest trying that.
>>
>> Dylan, which one is 2.0 for Linux? I can't trace it.
>
> He meant Starspan, but I don't see a 2.x version:
> http://starspan.casil.ucdavis.edu/doku/doku.php?id=download
> ?
>
> Markus

Exactly.
Re: [GRASS-user] improve v.rast.stats speed?
G. Allegri wrote:
> Thanks for the ideas. I've just tried Starspan but its performance is still too slow. I let it run for 15 minutes...
>
> r.statistics is probably the best solution. I've investigated the ArcGIS method and it actually seems to use a similar approach (rasterization of the features and various automations to join the results). In fact, they call the module Zonal Statistics, which is generally a set of raster-based methods. The only limitation of the current r.statistics is that it works only with CELL and not with float. OK, I can multiply my values and convert to CELL, but we could try to let r.statistics deal with floats too...

See the flags on r.statistics:

  -c  Cover values extracted from the category labels of the cover map

"Setting the -c flag the category labels of the covering raster layer will be used. This is nice to avoid the GRASS limitation to integer in raster maps because using category values floating point numbers can be stored." [1]

[1] http://grass.itc.it/grass63/manuals/html63_user/r.statistics.html

Ciao,
giovanni
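The "multiply my values and convert to CELL" workaround mentioned in the quoted message can be sketched in a few lines. This is only an illustration of the arithmetic in plain Python, not a GRASS call: the function names and the scale factor of 100 are assumptions made up for the example, and the precision kept is limited to what the scale factor preserves.

```python
def to_cell(values, scale=100):
    """Scale floats and round to integers (scale=100 keeps two decimals).

    'scale' is a hypothetical choice; pick it to match the precision needed.
    """
    return [int(round(v * scale)) for v in values]


def zonal_mean_scaled(zones, cells, scale=100):
    """Zonal mean computed on the integer cells, then scaled back to floats."""
    sums, counts = {}, {}
    for z, c in zip(zones, cells):
        sums[z] = sums.get(z, 0) + c
        counts[z] = counts.get(z, 0) + 1
    return {z: sums[z] / counts[z] / scale for z in sums}


zones = [1, 1, 2, 2]              # zone ID of each cell
values = [0.25, 0.75, 1.5, 2.5]   # floating-point cover values
cells = to_cell(values)           # integer stand-ins for the CELL map
means = zonal_mean_scaled(zones, cells)
```

The trade-off is that anything below 1/scale is rounded away before the statistics are computed, which is why handling floats directly would be preferable.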
Re: [GRASS-user] improve v.rast.stats speed?
Hi,

On Thursday 19 February 2009 13:20:52 G. Allegri wrote:
> I've just tried Starspan but its performance is still too slow. I let it run for 15 minutes...

I think I've seen your name on the scipy mailing list. My take on this is to rasterize my vector data with gdal_rasterize (you can have a look at the rasterisation code and see how it works, in case you need to, e.g., buffer your vector data), load it up in Python, load my dataset in Python, and calculate whatever stats with scipy+numpy. If you look at this thread, you'll find it is very fast: http://article.gmane.org/gmane.comp.python.scientific.user/19412.

Hope that helps!
Jose

--
Remote Sensing Unit          | Env. Monitoring and Modelling Group
Dept. of Geography           | Dept. of Geography
University College London    | King's College London
Gower St, London WC1E 6BT UK | Strand Campus, Strand, London WC2R 2LS UK
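As a rough illustration of the numpy side of this approach (not the code from the linked thread), the sketch below assumes the rasterized vector map and the data raster are already loaded as two arrays of the same shape. The function name, the convention that zone 0 means "no zone", and the use of NaN for missing data are all assumptions for the example.

```python
import numpy as np


def zonal_stats(zones, values):
    """Per-zone (count, sum, mean) in a single pass over both arrays.

    zones  : integer array of zone IDs (0 = outside every zone; an assumption)
    values : float array of the same shape; NaN marks missing data
    """
    z = zones.ravel()
    v = values.ravel()
    valid = (z > 0) & ~np.isnan(v)        # drop no-zone and no-data cells
    z, v = z[valid], v[valid]
    count = np.bincount(z)                # number of cells per zone ID
    total = np.bincount(z, weights=v)     # sum of values per zone ID
    ids = np.nonzero(count)[0]            # zone IDs that actually occur
    return {int(i): (int(count[i]), float(total[i]), float(total[i] / count[i]))
            for i in ids}


zones = np.array([[1, 1, 2],
                  [2, 2, 0]])
values = np.array([[1.0, 3.0, 10.0],
                   [20.0, np.nan, 5.0]])
stats = zonal_stats(zones, values)
```

Because np.bincount visits each cell once regardless of how many zones there are, the cost does not grow with the number of areas, unlike a loop with one mask per area.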
Re: [GRASS-user] improve v.rast.stats speed?
On Thursday 19 February 2009, G. Allegri wrote:
> Hi Dylan. I didn't let it finish because 15 minutes was already too long for my task. OK, that's less than the 5+ hours of v.rast.stats, but still far too much compared with ArcGIS and the rasterization solution in GRASS. I've built version 1.2.03, downloaded from [1]. Anyway, I suspect the same about GRASS driver inefficiencies in GDAL/OGR.
>
> [1] http://projects.atlas.ca.gov/frs/download.php/667/starspan-1.2.03.tar.gz

OK. This is the old stable branch (I think). If you can get 2.0 to compile I would suggest trying that. Starspan really needs to make it into OSGeo so that more eyes can get in on the development + bug tracking. At one point it was considerably faster than zonal stats in ArcGIS. I am planning on spending more time on Starspan from May.

Cheers,
Dylan

> 2009/2/19 Dylan Beaudette dylan.beaude...@gmail.com:
>> On Thu, Feb 19, 2009 at 5:20 AM, G. Allegri gioha...@gmail.com wrote:
>>> Thanks for the ideas. I've just tried Starspan but its performance is still too slow. I let it run for 15 minutes...
>>
>> Hi, Did you ever let it finish? Can you post the version number? I have noticed that starspan tends to be slower when using GRASS vector and raster features-- probably a combination of inefficiencies in GDAL/OGR with the GRASS formats.
>>
>> Dylan
>>
>>> r.statistics is probably the best solution. I've investigated the ArcGIS method and it actually seems to use a similar approach (rasterization of the features and various automations to join the results). In fact, they call the module Zonal Statistics, which is generally a set of raster-based methods. The only limitation of the current r.statistics is that it works only with CELL and not with float. OK, I can multiply my values and convert to CELL, but we could try to let r.statistics deal with floats too...
>>>
>>> I will try to batch the process and let you know the results.
>>>
>>> 2009/2/19 Markus Metz markus.metz.gisw...@googlemail.com:
>>>> Markus Metz wrote:
>>>>> G. Allegri wrote:
>>>>>> Hello list. Yesterday I needed to use v.rast.stats on 1793 areas covering a 4415x6632 raster (with a resolution of 50 m/pixel). I used it without extended statistics, but the processing time was, to put it mildly, very long. After 5 hours it still hadn't finished. As I needed the result this morning, I decided to reproduce it with ArcGIS: 40 seconds. I tried to find the bottleneck, but in the end I suppose it's a problem with the script itself (the looping chain of r.mapcalc and r.univar, and the creation and deletion of the MASK in each loop). Is there any way to improve the performance of v.rast.stats? Should we rewrite it in C and avoid the use of MASKs?
>>>>>
>>>>> I have two ideas.
>>>>> 1) Use r.reclass instead of r.mapcalc to create new masks. That should speed up at least the MASK creation and deletion.
>>>>> 2) Avoid the loop and MASK creation altogether. Run r.univar map=tmpname,raster. Process the output of r.univar, separate the stats for the different vector areas and convert them to SQL statements. Proceed as before. r.univar would be called only once. I'm not sure if this is possible. I also don't know if the speed gain from avoiding the loop is cancelled out by r.univar having to process two rasters as input.
>>>>
>>>> Idea 2 is nonsense, I hoped for some behaviour like in r.statistics.

--
Dylan Beaudette
Soil Resource Laboratory
http://casoilresource.lawr.ucdavis.edu/
University of California at Davis
530.754.7341
Re: [GRASS-user] improve v.rast.stats speed?
Thanks for the ideas. I've just tried Starspan but its performance is still too slow. I let it run for 15 minutes...

r.statistics is probably the best solution. I've investigated the ArcGIS method and it actually seems to use a similar approach (rasterization of the features and various automations to join the results). In fact, they call the module Zonal Statistics, which is generally a set of raster-based methods. The only limitation of the current r.statistics is that it works only with CELL and not with float. OK, I can multiply my values and convert to CELL, but we could try to let r.statistics deal with floats too...

I will try to batch the process and let you know the results.

2009/2/19 Markus Metz markus.metz.gisw...@googlemail.com:
> Markus Metz wrote:
>> G. Allegri wrote:
>>> Hello list. Yesterday I needed to use v.rast.stats on 1793 areas covering a 4415x6632 raster (with a resolution of 50 m/pixel). I used it without extended statistics, but the processing time was, to put it mildly, very long. After 5 hours it still hadn't finished. As I needed the result this morning, I decided to reproduce it with ArcGIS: 40 seconds. I tried to find the bottleneck, but in the end I suppose it's a problem with the script itself (the looping chain of r.mapcalc and r.univar, and the creation and deletion of the MASK in each loop). Is there any way to improve the performance of v.rast.stats? Should we rewrite it in C and avoid the use of MASKs?
>>
>> I have two ideas.
>> 1) Use r.reclass instead of r.mapcalc to create new masks. That should speed up at least the MASK creation and deletion.
>> 2) Avoid the loop and MASK creation altogether. Run r.univar map=tmpname,raster. Process the output of r.univar, separate the stats for the different vector areas and convert them to SQL statements. Proceed as before. r.univar would be called only once. I'm not sure if this is possible. I also don't know if the speed gain from avoiding the loop is cancelled out by r.univar having to process two rasters as input.
>
> Idea 2 is nonsense, I hoped for some behaviour like in r.statistics.
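The "convert to SQL statements" step in the quoted idea 2 could look roughly like the sketch below, whichever tool produces the per-area statistics. It is an assumption-laden illustration: the table name "areas" and the column names "dem_mean" and "dem_n" are hypothetical, "cat" is used as the key column, and values are formatted straight into the SQL text, which is only reasonable for trusted numeric output.

```python
def update_statements(table, stats, key="cat"):
    """Build one UPDATE statement per zone from {zone_id: {column: value}}."""
    sql = []
    for zone_id in sorted(stats):
        assignments = ", ".join(
            f"{col}={val!r}" for col, val in sorted(stats[zone_id].items())
        )
        sql.append(f"UPDATE {table} SET {assignments} WHERE {key}={zone_id};")
    return sql


# Hypothetical per-zone results, e.g. parsed from a single zonal-statistics run
stats = {
    1: {"dem_mean": 2.0, "dem_n": 2},
    2: {"dem_mean": 15.0, "dem_n": 2},
}
statements = update_statements("areas", stats)
```

Generating all statements first and sending them to the database in one batch avoids one database round trip per area.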
Re: [GRASS-user] improve v.rast.stats speed?
> r.statistics2 deals with accumulator-based aggregates, while r.statistics3 deals with quantiles.

Great news! Thanks, Glynn.

> There is currently no way to calculate the mode, although I'm unsure whether that is a meaningful concept for floating-point data.

I agree; the mode is about frequency, so it could be useful when dealing with value classes and histograms.
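The distinction between accumulator-based aggregates and quantiles can be illustrated with a small sketch. This is not how r.statistics2 or r.statistics3 are implemented; it is plain Python with made-up function names, showing only why the two kinds of statistic need different bookkeeping.

```python
from collections import defaultdict


def accumulate(zones, values):
    """Accumulator-based aggregates: one pass, a few running numbers per zone."""
    acc = defaultdict(lambda: [0, 0.0, float("inf"), float("-inf")])
    for z, v in zip(zones, values):
        a = acc[z]
        a[0] += 1              # count
        a[1] += v              # sum
        a[2] = min(a[2], v)    # minimum
        a[3] = max(a[3], v)    # maximum
    return {z: {"n": a[0], "mean": a[1] / a[0], "min": a[2], "max": a[3]}
            for z, a in acc.items()}


def median_per_zone(zones, values):
    """Quantiles need the values themselves: keep each zone's cells and sort them."""
    cells = defaultdict(list)
    for z, v in zip(zones, values):
        cells[z].append(v)
    out = {}
    for z, vals in cells.items():
        vals.sort()
        mid = len(vals) // 2
        out[z] = vals[mid] if len(vals) % 2 else (vals[mid - 1] + vals[mid]) / 2
    return out


zones = [1, 1, 1, 2, 2]
values = [0.5, 2.5, 1.5, 4.0, 6.0]
agg = accumulate(zones, values)
med = median_per_zone(zones, values)
```

The accumulator version needs constant memory per zone, while the naive quantile version above holds every cell value of a zone in memory, which is one plausible reason to keep the two in separate modules.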
Re: [GRASS-user] improve v.rast.stats speed?
On Thu, Feb 19, 2009 at 5:20 AM, G. Allegri gioha...@gmail.com wrote:
> Thanks for the ideas. I've just tried Starspan but its performance is still too slow. I let it run for 15 minutes...

Hi, Did you ever let it finish? Can you post the version number? I have noticed that starspan tends to be slower when using GRASS vector and raster features-- probably a combination of inefficiencies in GDAL/OGR with the GRASS formats.

Dylan

> r.statistics is probably the best solution. I've investigated the ArcGIS method and it actually seems to use a similar approach (rasterization of the features and various automations to join the results). In fact, they call the module Zonal Statistics, which is generally a set of raster-based methods. The only limitation of the current r.statistics is that it works only with CELL and not with float. OK, I can multiply my values and convert to CELL, but we could try to let r.statistics deal with floats too...
>
> I will try to batch the process and let you know the results.
>
> 2009/2/19 Markus Metz markus.metz.gisw...@googlemail.com:
>> Markus Metz wrote:
>>> G. Allegri wrote:
>>>> Hello list. Yesterday I needed to use v.rast.stats on 1793 areas covering a 4415x6632 raster (with a resolution of 50 m/pixel). I used it without extended statistics, but the processing time was, to put it mildly, very long. After 5 hours it still hadn't finished. As I needed the result this morning, I decided to reproduce it with ArcGIS: 40 seconds. I tried to find the bottleneck, but in the end I suppose it's a problem with the script itself (the looping chain of r.mapcalc and r.univar, and the creation and deletion of the MASK in each loop). Is there any way to improve the performance of v.rast.stats? Should we rewrite it in C and avoid the use of MASKs?
>>>
>>> I have two ideas.
>>> 1) Use r.reclass instead of r.mapcalc to create new masks. That should speed up at least the MASK creation and deletion.
>>> 2) Avoid the loop and MASK creation altogether. Run r.univar map=tmpname,raster. Process the output of r.univar, separate the stats for the different vector areas and convert them to SQL statements. Proceed as before. r.univar would be called only once. I'm not sure if this is possible. I also don't know if the speed gain from avoiding the loop is cancelled out by r.univar having to process two rasters as input.
>>
>> Idea 2 is nonsense, I hoped for some behaviour like in r.statistics.
Re: [GRASS-user] improve v.rast.stats speed?
Hi Dylan. I didn't let it finish because 15 minutes was already too long for my task. OK, that's less than the 5+ hours of v.rast.stats, but still far too much compared with ArcGIS and the rasterization solution in GRASS. I've built version 1.2.03, downloaded from [1]. Anyway, I suspect the same about GRASS driver inefficiencies in GDAL/OGR.

[1] http://projects.atlas.ca.gov/frs/download.php/667/starspan-1.2.03.tar.gz

2009/2/19 Dylan Beaudette dylan.beaude...@gmail.com:
> On Thu, Feb 19, 2009 at 5:20 AM, G. Allegri gioha...@gmail.com wrote:
>> Thanks for the ideas. I've just tried Starspan but its performance is still too slow. I let it run for 15 minutes...
>
> Hi, Did you ever let it finish? Can you post the version number? I have noticed that starspan tends to be slower when using GRASS vector and raster features-- probably a combination of inefficiencies in GDAL/OGR with the GRASS formats.
>
> Dylan
>
>> r.statistics is probably the best solution. I've investigated the ArcGIS method and it actually seems to use a similar approach (rasterization of the features and various automations to join the results). In fact, they call the module Zonal Statistics, which is generally a set of raster-based methods. The only limitation of the current r.statistics is that it works only with CELL and not with float. OK, I can multiply my values and convert to CELL, but we could try to let r.statistics deal with floats too...
>>
>> I will try to batch the process and let you know the results.
>>
>> 2009/2/19 Markus Metz markus.metz.gisw...@googlemail.com:
>>> Markus Metz wrote:
>>>> G. Allegri wrote:
>>>>> Hello list. Yesterday I needed to use v.rast.stats on 1793 areas covering a 4415x6632 raster (with a resolution of 50 m/pixel). I used it without extended statistics, but the processing time was, to put it mildly, very long. After 5 hours it still hadn't finished. As I needed the result this morning, I decided to reproduce it with ArcGIS: 40 seconds. I tried to find the bottleneck, but in the end I suppose it's a problem with the script itself (the looping chain of r.mapcalc and r.univar, and the creation and deletion of the MASK in each loop). Is there any way to improve the performance of v.rast.stats? Should we rewrite it in C and avoid the use of MASKs?
>>>>
>>>> I have two ideas.
>>>> 1) Use r.reclass instead of r.mapcalc to create new masks. That should speed up at least the MASK creation and deletion.
>>>> 2) Avoid the loop and MASK creation altogether. Run r.univar map=tmpname,raster. Process the output of r.univar, separate the stats for the different vector areas and convert them to SQL statements. Proceed as before. r.univar would be called only once. I'm not sure if this is possible. I also don't know if the speed gain from avoiding the loop is cancelled out by r.univar having to process two rasters as input.
>>>
>>> Idea 2 is nonsense, I hoped for some behaviour like in r.statistics.
Re: [GRASS-user] improve v.rast.stats speed?
G. Allegri:
> I've built version 1.2.03, downloaded from [1]. Anyway, I suspect the same about GRASS driver inefficiencies in GDAL/OGR.
>
> [1] http://projects.atlas.ca.gov/frs/download.php/667/starspan-1.2.03.tar.gz

Giovanni, sorry for the sort-of off-topic question, but how do you build starspan with GRASS support? Do you work under Ubuntu? I am trying the following:

    # configure
    ./configure --with-grass=/usr/local/grass-6.5.svn

    # compile... fails :-(
    make
    [...]
    g++ -DHAVE_CONFIG_H -I. -I./src -I./src -I./src/csv -I./src/jts -I./src/raster -I./src/rasterizers -I./src/stats -I./src/traverser -I./src/util -I./src/vector -g -I/usr/local/include -I/usr/local/include -I/usr/local/grass-6.5.svn/include -g -O2 -MT LineRasterizer.o -MD -MP -MF .deps/LineRasterizer.Tpo -c -o LineRasterizer.o `test -f 'src/rasterizers/LineRasterizer.cc' || echo './'`src/rasterizers/LineRasterizer.cc
    mv -f .deps/LineRasterizer.Tpo .deps/LineRasterizer.Po
    g++ -DHAVE_CONFIG_H -I. -I./src -I./src -I./src/csv -I./src/jts -I./src/raster -I./src/rasterizers -I./src/stats -I./src/traverser -I./src/util -I./src/vector -g -I/usr/local/include -I/usr/local/include -I/usr/local/grass-6.5.svn/include -g -O2 -MT Stats.o -MD -MP -MF .deps/Stats.Tpo -c -o Stats.o `test -f 'src/stats/Stats.cc' || echo './'`src/stats/Stats.cc
    src/stats/Stats.cc: In member function ‘void Stats::compute(std::vector<int, std::allocator<int> >, int)’:
    src/stats/Stats.cc:24: error: cannot convert ‘__gnu_cxx::__normal_iterator<int*, std::vector<int, std::allocator<int> > >’ to ‘const char*’ for argument ‘1’ to ‘int remove(const char*)’
    src/stats/Stats.cc:102: error: ‘sort’ was not declared in this scope
    src/stats/Stats.cc: In member function ‘void Stats::compute(std::vector<double, std::allocator<double> >, double)’:
    src/stats/Stats.cc:143: error: cannot convert ‘__gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >’ to ‘const char*’ for argument ‘1’ to ‘int remove(const char*)’
    src/stats/Stats.cc:232: error: ‘sort’ was not declared in this scope
    make: *** [Stats.o] Error 1
Re: [GRASS-user] improve v.rast.stats speed?
Hi Nikos.

> Giovanni, sorry for the sort-of off-topic question, but how do you build starspan with GRASS support? Do you work under Ubuntu? I am trying the following:

I haven't used it with GRASS data, only with a TIFF and a shapefile. My hypotheses about GDAL/OGR inefficiencies are not related to this case...
Re: [GRASS-user] improve v.rast.stats speed?
Thanks Markus. I'm out of the office, but I will try your solution as soon as possible.

2009/2/20 Markus Metz markus.metz.gisw...@googlemail.com:
> Hi Giovanni,
>
> could you please check out
>
>     svn checkout https://svn.osgeo.org/grass/grass-addons/raster/r.univar2.zonal
>
> and let me know if the results and the speed are OK? r.univar2.zonal does zonal statistics, if I understood the concept correctly. You have to give a zoning map as input. That would be, in your case, the rasterized vector areas. The output can be very long, as in your case, so I added an option to dump the stats to a file; that way it's easier to scrutinize. In my tests, the stats were identical to r.univar2 in develbranch_6, only that I had to run r.univar2.zonal only once and not several times.
>
> This r.univar2.zonal is not polished: 3D support is missing and the output is not shell style although it should be, no idea why not. I would like to leave the polishing to the experts. I haven't added an entry in the wiki addons page, because I don't know if this is an unpolished gem, or unnecessary with your current solution, or a pile of trash (well, not completely trash, the results are at least identical to r.univar2).
>
> Regards,
> Markus M
>
> G. Allegri wrote:
>> The rasterizing method gives performance comparable to ArcGIS. I confirm that it does the same. The bottleneck is the r.univar limitation to CELL. I have to investigate why, as it is based on r.stats, which works with DCELL/FCELL too. The final join would be an improvement over Zonal Statistics in ArcGIS, which simply produces a DBF with OIDs from polygon IDs but doesn't merge it into the original vector map.
>>
>> 2009/2/19 Markus Metz markus.metz.gisw...@googlemail.com:
>>> Markus Neteler wrote:
>>>> On Thu, Feb 19, 2009 at 2:20 PM, G. Allegri gioha...@gmail.com wrote:
>>>>> Thanks for the ideas. I've just tried Starspan but its performance is still too slow. I let it run for 15 minutes...
>>>>
>>>> Can you please try GRASS 7?
>>>> grass70/scripts/v.rast.stats/v.rast.stats.py
>>>
>>> grass70/scripts/v.rast.stats/v.rast.stats.py also uses r.mapcalc in every pass:
>>>
>>>     for i in cats:
>>>         ...
>>>         grass.mapcalc("MASK = if(...)")
>>>
>>> If v.rast.stats is faster in grass7, then it is probably because of improved raster libs. A speed increase from 5 hours to 40 seconds is unlikely, since grass.mapcalc is still called 1793 times (assuming each area has a unique category) for a region with 4415x6632 cells...
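The scale of the per-category loop described in the quoted text can be put into numbers. The sketch below is back-of-the-envelope arithmetic only, under the assumption that each pass touches every cell of the region at least once; the real cost per pass (mask creation plus statistics) would be higher. The variable names, and which dimension is rows versus columns, are illustrative; only the product matters.

```python
# Region size and number of areas as given in the thread
rows, cols = 4415, 6632
n_areas = 1793

cells_per_pass = rows * cols             # cells visited by one full-region pass
cells_looped = n_areas * cells_per_pass  # lower bound for the per-area loop
ratio = cells_looped // cells_per_pass   # how much more work than a single pass

print(f"one pass:        {cells_per_pass:,} cells")
print(f"per-area loop:   {cells_looped:,} cells")
print(f"loop / one pass: {ratio}x")
```

Under that assumption the loop reads tens of billions of cells where a single-pass zonal approach reads about 29 million, which is consistent with the gap between hours and seconds reported in the thread.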