Re: [GRASS-user] improve v.rast.stats speed?

2009-02-21 Thread Markus Metz


Hamish wrote:

Jose Gómez-Dans wrote:
  

My take on this is to rasterize my vector data with gdal_rasterize (you
can have a look at the rasterisation code and see how it works, in case
you need to eg buffer your vector data), load it up in python, load my
dataset in python, and calculate whatever stats with scipy+numpy. If 
you look at this thread, you'll find it is very fast: 
http://article.gmane.org/gmane.comp.python.scientific.user/19412.



numpy is already requested by the new wxGUI*, so with numpy around anyway,
maybe some python module could be written for grass7, where python is a
full dependency?

* see gui/wxpython/gui_modules/profile.py

  
I also think that GRASS should provide a reasonably fast way to get this 
kind of statistics. You can still devise your own solution if you want, but 
IMHO GRASS must be able to do this job reasonably fast and in a 
user-friendly way.
At the risk of becoming annoying: with r.univar.zonal, everything 
could be done in one pass: rasterize the vector, skip r.mapcalc entirely, 
run r.univar.zonal once (which itself needs only one pass), load the stats 
into the attribute table, done. With the example that started this thread, 
everything should be completed in a very few minutes. Rasterizing the 
vector might take the longest.
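
A minimal sketch of the one-pass idea, in plain Python rather than GRASS C
code. It assumes the zone raster and the value raster are already in memory
as lists of rows and that None marks a null cell; the function name
zonal_stats and the result field names are hypothetical and are not taken
from r.univar.zonal.

```python
import math
from collections import defaultdict


def zonal_stats(zones, values):
    """Accumulate per-zone statistics in a single pass over both rasters.

    zones and values are equally sized lists of rows; None marks a null cell
    (an assumption made for this sketch).
    """
    acc = defaultdict(lambda: {"n": 0, "sum": 0.0, "sumsq": 0.0,
                               "min": math.inf, "max": -math.inf})
    for zone_row, value_row in zip(zones, values):
        for zone, value in zip(zone_row, value_row):
            if zone is None or value is None:
                continue  # skip cells outside any zone or without data
            a = acc[zone]
            a["n"] += 1
            a["sum"] += value
            a["sumsq"] += value * value
            a["min"] = min(a["min"], value)
            a["max"] = max(a["max"], value)

    result = {}
    for zone, a in acc.items():
        mean = a["sum"] / a["n"]
        # population variance from the running sums, clamped at zero
        variance = max(a["sumsq"] / a["n"] - mean * mean, 0.0)
        result[zone] = {"n": a["n"], "mean": mean, "min": a["min"],
                        "max": a["max"], "stddev": math.sqrt(variance)}
    return result


zones = [[1, 1, 2],
         [1, 2, 2],
         [None, 2, 2]]
values = [[1.0, 2.0, 10.0],
          [3.0, 20.0, 30.0],
          [5.0, None, 40.0]]
stats = zonal_stats(zones, values)
```

Every cell is read exactly once regardless of how many zones there are,
which is the property that makes this approach independent of the number
of vector areas.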


Anyway, when it comes to processing time, I'm a speed junkie, and 5 
hours is simply unacceptable if the job can also be done in minutes or even 
seconds. GRASS should do that rather than forcing users to come up with 
their own workarounds for something that GRASS is supposed to do.


Markus M

___
grass-user mailing list
grass-user@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/grass-user


Re: [GRASS-user] improve v.rast.stats speed?

2009-02-21 Thread Nikos Alexandris
Dylan:
 OK. This is the old stable branch (I think). If you can get 2.0 to
 compile I would suggest trying that. 

Dylan, which one is 2.0 for linux? Can't trace it.

Thanks, Nikos



Re: [GRASS-user] improve v.rast.stats speed?

2009-02-21 Thread Markus Neteler
On Sat, Feb 21, 2009 at 11:19 PM, Nikos Alexandris
nikos.alexand...@felis.uni-freiburg.de wrote:
 Dylan:
 OK. This is the old stable branch (I think). If you can get 2.0 to
 compile I would suggest trying that.

 Dylan, which one is 2.0 for linux? Can't trace it.

He meant Starspan, but I don't see a 2.x version:

http://starspan.casil.ucdavis.edu/doku/doku.php?id=download

?
Markus


Re: [GRASS-user] improve v.rast.stats speed?

2009-02-21 Thread Nikos Alexandris
On Sat, 2009-02-21 at 23:53 +0100, Markus Neteler wrote:
 On Sat, Feb 21, 2009 at 11:19 PM, Nikos Alexandris
 nikos.alexand...@felis.uni-freiburg.de wrote:
  Dylan:
  OK. This is the old stable branch (I think). If you can get 2.0 to
  compile I would suggest trying that.
 
  Dylan, which one is 2.0 for linux? Can't trace it.
 
 He meant Starspan, but I don't see a 2.x version:
 
 http://starspan.casil.ucdavis.edu/doku/doku.php?id=download
 
 ?
 Markus

Exactly.



Re: [GRASS-user] improve v.rast.stats speed?

2009-02-20 Thread Giovanni Pasini

G. Allegri wrote:

Thanks for the ideas.
I've just tried Starspan but its performance is still too slow. I
let it run for 15 minutes...

r.statistics is probably the best solution. I've investigated the
ArcGIS method and it actually seems to use a similar method
(rasterization of the features and various automations to join the
results). In fact they call the module zonal statistics, which is
generally a set of raster-based methods.

the only limitation of the actual r.statistics is that it works only
with CELL and not float. Ok, I can multiply my values and convert to
CELL, but we could try to let r.statistics deal with floats too...



see flags on r.statistics:

-c Cover values extracted from the category labels of the cover map

With the -c flag set, the category labels of the cover raster layer 
will be used. This is a nice way to avoid the GRASS limitation to integers 
in raster maps, because floating-point numbers can be stored in category 
labels. [1]



[1] http://grass.itc.it/grass63/manuals/html63_user/r.statistics.html

Ciao,
giovanni


Re: [GRASS-user] improve v.rast.stats speed?

2009-02-20 Thread Jose Gómez-Dans
Hi,

On Thursday 19 February 2009 13:20:52 G. Allegri wrote:
 I've just tried Starspan but its performance is still too slow. I've
 let it run for 15 minutes...

I think I've seen your name on the scipy mailing list. My take on this is to 
rasterize my vector data with gdal_rasterize (you can have a look at the 
rasterisation code and see how it works, in case you need to, e.g., buffer 
your vector data), load it up in python, load my dataset in python, and 
calculate whatever stats with scipy+numpy. If you look at this thread, 
you'll find it is very fast: 
http://article.gmane.org/gmane.comp.python.scientific.user/19412
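
For illustration, a minimal sketch of per-zone means with numpy once both
rasters are loaded as arrays. This is not the code from the linked thread; it
uses np.bincount, one common way to do grouped sums. It assumes the
rasterized vector holds positive integer zone ids with 0 meaning "no zone",
and that nodata in the value raster is NaN. The function name zonal_mean is
hypothetical. scipy.ndimage also offers labelled statistics (for example
scipy.ndimage.mean with a labels array), which may be more convenient.

```python
import numpy as np


def zonal_mean(zones, values):
    """Return {zone_id: mean} for every zone id > 0 present in `zones`."""
    zones = np.asarray(zones).ravel()
    values = np.asarray(values, dtype=float).ravel()

    # keep only cells that belong to a zone and carry a valid value
    valid = (zones > 0) & ~np.isnan(values)

    counts = np.bincount(zones[valid])                       # cells per zone
    sums = np.bincount(zones[valid], weights=values[valid])  # value sum per zone

    present = np.nonzero(counts)[0]
    return {int(z): float(sums[z] / counts[z]) for z in present}


zones = np.array([[1, 1, 2],
                  [0, 2, 2]])
values = np.array([[1.0, 3.0, 10.0],
                   [99.0, np.nan, 20.0]])
means = zonal_mean(zones, values)
```

Both bincount calls scan the arrays once, so the cost does not grow with
the number of zones.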

Hope that helps!
Jose

-- 
Remote Sensing Unit   | Env. Monitoring and Modelling Group
Dept. of Geography| Dept. of Geography
University College London | King's College London
Gower St, London WC1E 6BT UK  | Strand Campus, Strand, London WC2R 2LS UK


Re: [GRASS-user] improve v.rast.stats speed?

2009-02-20 Thread Dylan Beaudette
On Thursday 19 February 2009, G. Allegri wrote:
 Hi Dylan.
 I didn't let it finish because 15 minutes was too long for my task.
 OK, that is less than the 5+ hours of v.rast.stats, but still too much
 compared to ArcGIS and the rasterization solution in GRASS.
 I've built the 1.2.03 version, downloaded from [1].
 Anyway, I suspect the same about GRASS driver inefficiencies in GDAL/OGR.

 [1]
 http://projects.atlas.ca.gov/frs/download.php/667/starspan-1.2.03.tar.gz

OK. This is the old stable branch (I think). If you can get 2.0 to compile I 
would suggest trying that. Starspan really needs to make it into OSGeo so 
that more eyes can get in on the development + bug tracking. At one point it 
was considerably faster than zonal stats in ArcGIS. I am planning on spending 
more time on Starspan from May.

Cheers,

Dylan


 2009/2/19 Dylan Beaudette dylan.beaude...@gmail.com:
  On Thu, Feb 19, 2009 at 5:20 AM, G. Allegri gioha...@gmail.com wrote:
  Thanks for the ideas.
  I've just tried Starspan but its performance is still too slow. I've
  let it run for 15 minutes...
 
  Hi,
 
  Did you ever let it finish? Can you post the version number? I have
  noticed that starspan tends to be slower when using GRASS vector and
  raster features-- probably a combination of inefficiencies in GDAL/OGR
  with the GRASS formats.
 
 
  Dylan
 
  r.statistics is probably the best solution. I've investigated the
  ArcGIS method and it actually seems to use a similar method
  (rasterization of the features and various automations to join the
  results). In fact they call the module zonal statistics, which is
  generally a set of raster-based methods.
 
  the only limitation of the actual r.statistics is that it works only
  with CELL and not float. Ok, I can multiply my values and convert to
  CELL, but we could try to let r.statistics deal with floats too...
 
  I will try to batch the process and let you know the results.
 
  2009/2/19 Markus Metz markus.metz.gisw...@googlemail.com:
  Markus Metz wrote:
  G. Allegri wrote:
  Hello list.
  Yesterday I needed to use v.rast.stats on 1793 areas covering a
  4415x6632 raster (with resolution 50m/pixel). I used it without
  extended statistics but the processing time was, to put it mildly,
  very, very long. After 5 hours it still wasn't finished. As I needed it
  for this morning I decided to reproduce it with ArcGIS: 40
  seconds. I've tried to investigate what was going wrong, the
  bottleneck, but in the end I suppose that it's a problem of the
  script itself (the looping chain of r.mapcalc and r.univar, the
  creation and deletion of the MASK in each loop).
  Is there any way to improve the performance of v.rast.stats? Should
  we rewrite it in C and avoid the use of MASKs?
 
  I have two ideas.
  1) Use r.reclass instead of r.mapcalc to create new masks. That should
  speed up at least the MASK creation and deletion
  2) Avoid the loop and MASK creation altogether. Run r.univar
  map=tmpname,raster. Process the output of r.univar, separate stats for
  the different vector areas and convert to sql statements. Proceed as
  before. r.univar would be called only once. I'm not sure if this is
  possible. I also don't know if the speed gain by avoiding the loop is
  annihilated by r.univar having to process two rasters as input.
 
  Idea 2 is nonsense, I hoped for some behaviour like in r.statistics.
 



-- 
Dylan Beaudette
Soil Resource Laboratory
http://casoilresource.lawr.ucdavis.edu/
University of California at Davis
530.754.7341


Re: [GRASS-user] improve v.rast.stats speed?

2009-02-19 Thread G. Allegri
Thanks for the ideas.
I've just tried Starspan but its performance is still too slow. I
let it run for 15 minutes...

r.statistics is probably the best solution. I've investigated the
ArcGIS method and it actually seems to use a similar method
(rasterization of the features and various automations to join the
results). In fact they call the module zonal statistics, which is
generally a set of raster-based methods.

the only limitation of the actual r.statistics is that it works only
with CELL and not float. Ok, I can multiply my values and convert to
CELL, but we could try to let r.statistics deal with floats too...
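
A small sketch of the "multiply and convert to CELL" workaround, in plain
Python rather than GRASS commands. The scale factor of 1000 (three decimal
places) and the helper names to_cell and from_cell are hypothetical choices
for illustration only.

```python
SCALE = 1000  # hypothetical: keeps three decimal places


def to_cell(value, scale=SCALE):
    """Convert a float to a scaled integer (CELL-like) value."""
    return int(round(value * scale))


def from_cell(cell, scale=SCALE):
    """Convert a scaled integer statistic back to the original units."""
    return cell / scale


values = [12.345, 0.5, 7.25]
cells = [to_cell(v) for v in values]

# statistics that are linear in the values (mean, min, max, sum) can be
# computed on the integers and scaled back afterwards
mean_value = from_cell(sum(cells) / len(cells))
min_value = from_cell(min(cells))
```

Anything finer than 1/SCALE is lost in the rounding, and a variance computed
on the scaled integers has to be divided by SCALE squared rather than SCALE.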

I will try to batch the process and let you know the results.

2009/2/19 Markus Metz markus.metz.gisw...@googlemail.com:


 Markus Metz wrote:

 G. Allegri wrote:

 Hello list.
 Yesterday I needed to use v.rast.stats on 1793 areas covering a
 4415x6632 raster (with resolution 50m/pixel). I used it without
 extended statistics but the processing time was, to put it mildly,
 very, very long. After 5 hours it still wasn't finished. As I needed it
 for this morning I decided to reproduce it with ArcGIS: 40
 seconds. I've tried to investigate what was going wrong, the
 bottleneck, but in the end I suppose that it's a problem of the script
 itself (the looping chain of r.mapcalc and r.univar, the creation and
 deletion of the MASK in each loop).
 Is there any way to improve the performance of v.rast.stats? Should we
 rewrite it in C and avoid the use of MASKs?


 I have two ideas.
 1) Use r.reclass instead of r.mapcalc to create new masks. That should
 speed up at least the MASK creation and deletion
 2) Avoid the loop and MASK creation altogether. Run r.univar
 map=tmpname,raster. Process the output of r.univar, separate stats for the
 different vector areas and convert to sql statements. Proceed as before.
 r.univar would be called only once. I'm not sure if this is possible. I also
 don't know if the speed gain by avoiding the loop is annihilated by r.univar
 having to process two rasters as input.

 Idea 2 is nonsense, I hoped for some behaviour like in r.statistics.




Re: [GRASS-user] improve v.rast.stats speed?

2009-02-19 Thread G. Allegri
 r.statistics2 deals with accumulator-based aggregates, while
 r.statistics3 deals with quantiles.

Great news! thanks Glynn

 There is currently no way to
 calculate the mode, although I'm unsure whether that is a meaningful
 concept for floating-point data.

I agree, mode is about frequency, so it could be useful when dealing
with value classes and histograms.
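
To illustrate the distinction (not the actual r.statistics2/r.statistics3
code): an accumulator-based aggregate such as the mean only needs running
totals, an exact quantile such as the median needs the values themselves,
and the mode only makes sense when values repeat, as class values do. The
class and function names below are hypothetical.

```python
from collections import Counter


class RunningMean:
    """Accumulator: constant memory, one update per value."""

    def __init__(self):
        self.n = 0
        self.total = 0.0

    def add(self, x):
        self.n += 1
        self.total += x

    @property
    def mean(self):
        return self.total / self.n


def median(values):
    """Exact quantile: all values must be kept and ordered."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def mode(values):
    """Most frequent value; ties are broken by first occurrence."""
    return Counter(values).most_common(1)[0][0]


rm = RunningMean()
for x in [1.0, 2.0, 3.0, 6.0]:
    rm.add(x)

classes = [3, 1, 3, 2, 3, 1]       # class values repeat, so a mode exists
floats = [0.1, 0.25, 0.7, 0.31]    # every value occurs once: mode is arbitrary
```

With floating-point data like `floats`, every value typically occurs once, so
`mode` just returns whichever value it meets first, which is why binning
into classes is needed before the mode becomes informative.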


Re: [GRASS-user] improve v.rast.stats speed?

2009-02-19 Thread Dylan Beaudette
On Thu, Feb 19, 2009 at 5:20 AM, G. Allegri gioha...@gmail.com wrote:
 Thanks for the ideas.
 I've just tried Starspan but its performance is still too slow. I've
 let it run for 15 minutes...

Hi,

Did you ever let it finish? Can you post the version number? I have
noticed that starspan tends to be slower when using GRASS vector and
raster features-- probably a combination of inefficiencies in GDAL/OGR
with the GRASS formats.


Dylan





Re: [GRASS-user] improve v.rast.stats speed?

2009-02-19 Thread G. Allegri
Hi Dylan.
I didn't let it finish because 15 minutes was too long for my task.
OK, that is less than the 5+ hours of v.rast.stats, but still too much
compared to ArcGIS and the rasterization solution in GRASS.
I've built the 1.2.03 version, downloaded from [1].
Anyway, I suspect the same about GRASS driver inefficiencies in GDAL/OGR.

[1] http://projects.atlas.ca.gov/frs/download.php/667/starspan-1.2.03.tar.gz

2009/2/19 Dylan Beaudette dylan.beaude...@gmail.com:
 On Thu, Feb 19, 2009 at 5:20 AM, G. Allegri gioha...@gmail.com wrote:
 Thanks for the ideas.
 I've just tried Starspan but its performance is still too slow. I've
 let it run for 15 minutes...

 Hi,

 Did you ever let it finish? Can you post the version number? I have
 noticed that starspan tends to be slower when using GRASS vector and
 raster features-- probably a combination of inefficiencies in GDAL/OGR
 with the GRASS formats.


 Dylan





Re: [GRASS-user] improve v.rast.stats speed?

2009-02-19 Thread Nikos Alexandris
G. Allegri:

 I've built the 1.2.03 version, downloaded from [1].
 Anyway I suspect the same about GRASS driver inefficiencies in GDAL/OGR
 [1] http://projects.atlas.ca.gov/frs/download.php/667/starspan-1.2.03.tar.gz


Giovanni,
sorry for the sort-of off-topic, but how do you build starspan with
GRASS support? Do you work under Ubuntu? I am trying the following:

#configure
./configure --with-grass=/usr/local/grass-6.5.svn

# compile... fails :-(
make

[...]
g++ -DHAVE_CONFIG_H -I. -I./src -I./src -I./src/csv -I./src/jts
-I./src/raster -I./src/rasterizers -I./src/stats -I./src/traverser
-I./src/util -I./src/vector -g -I/usr/local/include
-I/usr/local/include -I/usr/local/grass-6.5.svn/include  -g -O2 -MT
LineRasterizer.o -MD -MP -MF .deps/LineRasterizer.Tpo -c -o
LineRasterizer.o `test -f 'src/rasterizers/LineRasterizer.cc' || echo
'./'`src/rasterizers/LineRasterizer.cc
mv -f .deps/LineRasterizer.Tpo .deps/LineRasterizer.Po
g++ -DHAVE_CONFIG_H -I. -I./src -I./src -I./src/csv -I./src/jts
-I./src/raster -I./src/rasterizers -I./src/stats -I./src/traverser
-I./src/util -I./src/vector -g -I/usr/local/include
-I/usr/local/include -I/usr/local/grass-6.5.svn/include  -g -O2 -MT
Stats.o -MD -MP -MF .deps/Stats.Tpo -c -o Stats.o `test -f
'src/stats/Stats.cc' || echo './'`src/stats/Stats.cc
src/stats/Stats.cc: In member function ‘void
Stats::compute(std::vector<int, std::allocator<int> >, int)’:
src/stats/Stats.cc:24: error: cannot convert
‘__gnu_cxx::__normal_iterator<int*, std::vector<int, std::allocator<int>
> >’ to ‘const char*’ for argument ‘1’ to ‘int remove(const char*)’
src/stats/Stats.cc:102: error: ‘sort’ was not declared in this scope
src/stats/Stats.cc: In member function ‘void
Stats::compute(std::vector<double, std::allocator<double> >, double)’:
src/stats/Stats.cc:143: error: cannot convert
‘__gnu_cxx::__normal_iterator<double*, std::vector<double,
std::allocator<double> > >’ to ‘const char*’ for argument ‘1’ to ‘int
remove(const char*)’
src/stats/Stats.cc:232: error: ‘sort’ was not declared in this scope
make: *** [Stats.o] Error 1



Re: [GRASS-user] improve v.rast.stats speed?

2009-02-19 Thread G. Allegri
Hi Nikos.

 Giovanni,
 sorry for the sort-of off-topic, but how do you build starspan with
 GRASS support? Do you work under Ubuntu? I am trying the following:

I haven't used it with GRASS data but with a TIFF and a shapefile.
My hypotheses about GDAL/OGR inefficiencies are not related to this case...


Re: [GRASS-user] improve v.rast.stats speed?

2009-02-19 Thread G. Allegri
Thanks Markus. I'm out of office, but I will try your solution as soon
as possible.

2009/2/20 Markus Metz markus.metz.gisw...@googlemail.com:
 Hi Giovanni,

 could you please check out
 svn checkout https://svn.osgeo.org/grass/grass-addons/raster/r.univar2.zonal
 and let me know if the results and the speed are ok?

 r.univar2.zonal does zonal statistics, if I understood the concept correctly.
 You have to give a zoning map as input. That would be in your case the
 rasterized vector areas. The output can be very long, as in your case, so I
 added an option to dump the stats in a file, that way it's easier to
 scrutinize.
 In my tests, the stats were identical to r.univar2 in develbranch_6, only
 that I had to run r.univar2.zonal only once and not several times. This
 r.univar2.zonal is not polished: 3D support is missing and the output is not
 shell style although it should be, no idea why not. I would like to leave
 the polishing to the experts.

 I haven't added an entry in wiki addons, because I don't know if this is an
 unpolished gem or unnecessary with your current solution or a pile of trash
 (well, not completely trash, results are at least identical to r.univar2).

 Regards,

 Markus M


 G. Allegri wrote:

 The rasterizing method gives performance comparable to ArcGIS. I
 can confirm that ArcGIS does the same.
 The bottleneck is the r.univar limitation to CELL. I have to
 investigate why, as it is based on r.stats which works with
 DCELL/FCELL too

 The final join would be an improvement over Zonal Statistics in
 ArcGIS, which simply produces a dbf with OIDs from polygon IDs but
 doesn't merge it into the original vector map.


 2009/2/19 Markus Metz markus.metz.gisw...@googlemail.com:


 Markus Neteler wrote:


 On Thu, Feb 19, 2009 at 2:20 PM, G. Allegri gioha...@gmail.com wrote:



 Thanks for the ideas.
 I've just tried Starspan but its performance is still too slow. I've
 let it run for 15 minutes...



 Can you please try GRASS 7?
 grass70/scripts/v.rast.stats/v.rast.stats.py



 grass70/scripts/v.rast.stats/v.rast.stats.py also uses r.mapcalc in every
 pass:

 for i in cats:
  ...
  grass.mapcalc(MASK = if(...)

 if v.rast.stats is faster in grass7, then probably because of improved
 raster libs. A speed increase from 5 hours to 40 seconds is unlikely
 since
 grass.mapcalc is still called 1793 times (assuming each area has a unique
 category) for a region with 4415x6632 cells...
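
 A back-of-the-envelope check of that workload, as a sketch. It assumes
 every r.mapcalc call reads each cell of the region once, which is a lower
 bound since r.univar runs in each loop iteration as well. The variable
 names are illustrative only.

```python
# region size and number of areas as reported in this thread
DIM_A, DIM_B = 4415, 6632
AREAS = 1793

cells_per_pass = DIM_A * DIM_B            # 29,280,280 cells in one pass
cells_looped = cells_per_pass * AREAS     # 52,499,542,040 cells for 1793 passes

# how many times more cell reads the per-area loop needs than a single pass
ratio = cells_looped // cells_per_pass

print(f"single pass : {cells_per_pass:,} cells")
print(f"per-area loop: {cells_looped:,} cells ({ratio}x)")
```

 A single-pass zonal approach touches about 29 million cells, while the
 per-category loop touches more than 52 billion, before counting MASK
 creation and deletion.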





