On 10/08/10 17:50, Nikos Alexandris wrote:
Hmmm... "dmax=0.0": Is this _my_ problem perhaps? Instead of setting it
directly I was trying to estimate it first with "v.distance -pa" which means
that I misunderstood the whole process :-/
dmax=0.0 means: only those features that are in the same place, i.e. in
the case of from=points and to=areas => only those points which fall
into areas.
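To make the dmax=0.0 semantics concrete, here is a minimal pure-Python sketch. It is an illustration only and not how v.distance is implemented: the function name cell_containing, the rectangular cells and the sample coordinates are all hypothetical. A point gets a value only when it lies inside an area; otherwise nothing is uploaded for it.

```python
# Illustration only: NOT the v.distance implementation.
# All names and coordinates here are hypothetical.

def cell_containing(point, cells):
    """Return the cat of the first cell containing the point, else None.

    cells maps a cat value to a rectangle (west, south, east, north).
    """
    x, y = point
    for cat, (west, south, east, north) in cells.items():
        if west <= x <= east and south <= y <= north:
            return cat
    # No area at distance 0 from this point: nothing would be uploaded.
    return None


# Two adjacent grid cells and three sample points.
cells = {1: (0.0, 0.0, 10.0, 10.0), 2: (10.0, 0.0, 20.0, 10.0)}
points = [(2.5, 3.0), (15.0, 9.0), (25.0, 5.0)]

matches = [cell_containing(p, cells) for p in points]
print(matches)
```

The third point lies outside both cells, so it gets no match, which corresponds to a point that does not fall into any area.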
So, using the combination of v.distance and db.select I cannot reproduce
your problem with 600,000 points, but maybe the number and nature of
polygons can also play a role...
That's interesting. Maybe I have once again done something very messy(?). I
use the 3rd script inside the file attached to ticket #804 [1]. Although this
(old) script still executes a very large number of SQL statements very
inefficiently, the problem lies only in v.what.vect (and so in v.distance),
before the SQL calls.
The script counts the points of several point maps (for example: 404347 points)
that fall inside boxes (which compose a fishnet that I call cell-grid, for
example: 1320 vector cells). One run with the above-mentioned numbers takes
more than 10h.
The relevant lines in the Python script are:
# carry low resolution grid-cell "CAT"s over to reference vector points
grass.run_command('v.what.vect',
                  flags='-v',
                  quiet=False,
                  vector=reference_points_map,
                  qvect=lowres_vector_grid,
                  column=gridcell_column,
                  qcolumn="cat")
Of course I checked the "problem" with my data by testing the "v.what.vect"
commands on their own, outside of my messy script.
The trials I did with spearfish (random data) are equally slow. I can
pass on some of my data (off-list please), or let me find some time later or
tomorrow to copy-paste from my history the exact commands of my test within
spearfish60.
I just did a similar test with the same points and a grid created by
v.mkgrid grid=35,40
(using the same column cat_municip from the previous test example)
time v.distance from=mypoi...@sqlite to=mygrid upload=cat
column=cat_municip dmax=0.0
real 2m21.205s
Then testing the idea from the link Markus N added to your bug report:
time v.db.update mygrid col=count value="(SELECT count(*) from mypoints
WHERE mygrid.cat=mypoints.cat_municip group by cat_municip)"
real 5m28.312s
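The behaviour of that correlated subquery can be tried on a toy SQLite database. This is only a sketch under stated assumptions: the table and column names follow the commands above, but the table definitions and the sample rows are invented for illustration and say nothing about the timings reported here.

```python
import sqlite3

# In-memory toy database; the schema and rows are made up for illustration.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute('CREATE TABLE mygrid (cat INTEGER PRIMARY KEY, "count" INTEGER)')
cur.execute("CREATE TABLE mypoints (cat INTEGER PRIMARY KEY, cat_municip INTEGER)")

# Three grid cells; four points, one of which fell into no cell (NULL).
cur.executemany("INSERT INTO mygrid (cat) VALUES (?)", [(1,), (2,), (3,)])
cur.executemany(
    "INSERT INTO mypoints (cat, cat_municip) VALUES (?, ?)",
    [(1, 1), (2, 1), (3, 2), (4, None)],
)

# Same correlated subquery as passed to v.db.update above.
cur.execute(
    """
    UPDATE mygrid SET "count" = (
        SELECT count(*) FROM mypoints
        WHERE mygrid.cat = mypoints.cat_municip
        GROUP BY cat_municip
    )
    """
)
conn.commit()

counts = cur.execute('SELECT cat, "count" FROM mygrid ORDER BY cat').fetchall()
print(counts)
conn.close()
```

Because of the GROUP BY, a cell without any points ends up with NULL rather than 0: the subquery returns no row at all for it.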
One hypothesis I had was that, since v.what.vect uses the upload=to_attr
option and therefore has to query the to_map's attribute table, this might
create significant database connection overhead, but when
using
time v.distance from=mypoi...@sqlite to=mygrid upload=to_attr
column=cat_municip to_column=cat dmax=0.0
I get
real 2m13.741s
so no significant difference...
And a final test with v.what.vect:
time v.what.vect mypoints col=cat_municip qvector=mygrid qcolumn=cat
real 2m9.377s
I'm pretty much at a loss about what causes your problem...
(
Just a quick look: I did not set dmax=0.0 in my (v.distance) tests. Then
again, in "v.what.vect" it is set by default to 0.0, right? Isn't this default
dmax=0.0 passed (by default) to v.distance?
)
Yes.
Moritz