Re: [GRASS-dev] large vector problems

2009-02-26 Thread Moritz Lennert

On 26/02/09 08:56, Vincent Bain wrote:

Hello,
reading this thread, and being sometimes concerned with large vector
files (associated with big related tables), I wonder if it's worth
manually creating indexes (on the cat field): can it be an effective way to
speed up queries, or is the bottleneck elsewhere, at the geometric level (data
handled by GRASS, not the linked DBMS)?



AFAIK, an index on the cat field is created automatically when a new 
vector is created, but if you link an existing table to a vector map, no 
indices are created. So, yes, creating an index on your cat field can 
make a significant difference for operations involving that field...
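For example (a sketch only; the map, table, and index names below are
placeholders for your own):

  # link an existing PostgreSQL table to the vector map on the cat column
  v.db.connect map=mymap table=mytable key=cat
  # create the missing index on the key column in the backend
  echo "CREATE INDEX mytable_cat_idx ON mytable (cat)" | db.execute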


Moritz


Re: [GRASS-dev] large vector problems

2009-02-25 Thread Markus Metz


Wouter Boasson wrote:

Dear Markus, Markus and Jens,

My MacBook survived the burn-in test :-) After 29 hours under full load,
alternating between processor-limited and disk-access-limited phases, I got
my mega-vector file cleaned. Thank you for the support and suggestions for
solving my problems with the large vector cleaning operation!

Glad to hear that it worked in the end!

Although the dataset was cleaned, files of this size are virtually
impossible to handle, especially as standard querying, extraction, and overlay
operations with raster datasets simply take too much time.
There are ways to improve both speed and memory consumption. There are
hints in the source code and I have some ideas, but this is no easy
task. The GRASS vector model is complex, and changes need a lot of testing
before they can be applied. And there are not that many developers
working on the core GRASS vector libraries... This will only happen in
GRASS 7, I guess. Hopefully sometime this year...

I'll post a few related issues with mega files that make working with them
very difficult. I'll post them as (separate) enhancement requests on trac, as
they are in my opinion of major importance:
- selecting a large vector map from a dropdown box in the wxPython GUI takes
a long time
- renaming this vector took 25 minutes (PostgreSQL access!)
- v.extract is also incredibly slow
- removing a vector file with an unreachable PostgreSQL database link does
not work, not even in force mode
- v.what consumes several GB of RAM just for querying a large vector map??

Some of the above operations could be improved, but it will take some time.

- v.rast.stats suffers from setting masks, extracting polygons, and querying;
it is no longer usable for vector files this size, as it is a particularly
slow operation
Try the example script in the help page of r.univar.zonal, available in 
the grass-addons:

http://grass.osgeo.org/wiki/GRASS_AddOns#r.univar.zonal
It should be easy to modify to your needs; it does something very
similar to v.rast.stats, only faster.
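The general idea (a rough sketch using stock modules rather than the addon
script itself; the map names are placeholders) is to rasterize the polygons
once and then do the statistics entirely on the raster side:

  # rasterize the polygons by category value in a single pass;
  # "bigvect" and "covermap" are placeholder map names
  v.to.rast input=bigvect output=bigvect_zones use=cat
  # per-zone statistics on the raster side; note that r.statistics
  # expects an integer (CELL) cover map
  r.statistics base=bigvect_zones cover=covermap method=average output=zonal_avg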


Best regards,

Markus M



Re: [GRASS-dev] large vector problems

2009-02-25 Thread Vincent Bain

Hello,
reading this thread, and being sometimes concerned with large vector
files (associated with big related tables), I wonder if it's worth
manually creating indexes (on the cat field): can it be an effective way to
speed up queries, or is the bottleneck elsewhere, at the geometric level (data
handled by GRASS, not the linked DBMS)?

Thank you,
Vincent


___
grass-dev mailing list
grass-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/grass-dev