OK, here comes (soon) a speed-up for v.distance. The test case is the NC data set:
I generated 10000 random vector points with r.random, all within North
Carolina. As areas I used boundary_municp, scattered areas, some points
are within an area, most are outside any area. No dmax used with
v.distance.

Original:
about 25s for updating the table,
about 6m25s used for distance calculations

Tuned:
about 25s for updating the table,
about 2s :-))) used for distance calculations

Results for the 10000 points are identical (distance to nearest area and
category of nearest area).

The code is now a bit more complicated, but reducing processing time for
distance calculations from over 6m down to 2s might justify some code
complexity.

Markus M

Moritz Lennert wrote:
> On 10/08/10 15:17, Moritz Lennert wrote:
>>
>> On 10/08/10 13:49, Nikos Alexandris wrote:
>>>
>>> Markus M:
>>>
>>>> If a point is inside an area (the polygon composed of the area's
>>>> boundaries), the distance is 0 (zero):
>>>
>>> This sentence makes me think that it is a priori known (based on
>>> something else - related to topology?) when a point is inside an area.
>>> Why all the need to measure distances then in order to count how many
>>> points are inside?
>>
>> As you can see in the code referenced by Markus, there is a
>> Vect_point_in_area(), so yes, it is possible to more directly check if
>> points are in areas. It all depends on which modules were written using
>> this function. At this stage all point-in-polygon attempts in GRASS are
>> scripts using workarounds...
>
> As a follow-up:
>
> The counting points in polygons algorithm I prefer at this stage is (using
> municipal boundaries and hospitals in the NC data set with an SQLite backend
> - DBF won't work):
>
> g.copy vect=hospitals,myhospitals
> v.db.addcol myhospitals col="cat_municip int"
> v.distance from=myhospit...@sqlite to=boundary_mun...@permanent upload=cat
> column=cat_municip dmax=0.0
> db.select sql="select cat_municip, count(*) from myhospitals group by
> cat_municip"
>
> If your hospital attribute table contains the number of beds (nbeds), then
> you could sum the number of beds as such:
>
> db.select sql="select cat_municip, sum(nbeds) from myhospitals group by
> cat_municip"
>
> etc...
>
> Using 6.5 to test a similar case to yours (I assume):
>
> g.region vect=boundary_municp
>
> v.random out=mypoints n=600000
>
> v.db.addtable mypoints col="cat int, cat_municip int" (that's veeeeery slow,
> probably because of 600000 update statements to the database in the v.to.db
> call...)
>
> time v.distance from=mypoi...@sqlite to=boundary_mun...@permanent upload=cat
> column=cat_municip dmax=0.0
>
> real 2m2.119s <= not so bad
>
> db.select sql="select cat_municip, count(*) from mypoints group by
> cat_municip"
>
> So, using the combination of v.distance and db.select I cannot reproduce
> your problem with 600,000 points, but maybe the number and nature of
> polygons can also play a role...
>
> Moritz
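
For anyone who wants to try the counting step without a GRASS session, here is a minimal sketch in Python using the standard sqlite3 module. It only reproduces the two "group by" queries quoted above against a small in-memory table. The rows are made up for illustration, and it assumes that v.distance with dmax=0.0 leaves cat_municip as NULL for points that fall outside every area. The ORDER BY is added here only to make the output order predictable.

```python
import sqlite3

# Made-up stand-in for the myhospitals attribute table after v.distance has
# filled cat_municip. Assumption: hospitals outside every municipality keep
# cat_municip = NULL (None here).
rows = [
    # (cat, cat_municip, nbeds)
    (1, 10, 120),
    (2, 10, 80),
    (3, 20, 200),
    (4, None, 50),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE myhospitals (cat INTEGER, cat_municip INTEGER, nbeds INTEGER)"
)
conn.executemany("INSERT INTO myhospitals VALUES (?, ?, ?)", rows)

# Number of hospitals per municipality (same query as in the post,
# plus ORDER BY for a stable row order).
counts = conn.execute(
    "select cat_municip, count(*) from myhospitals "
    "group by cat_municip order by cat_municip"
).fetchall()

# Total number of beds per municipality.
beds = conn.execute(
    "select cat_municip, sum(nbeds) from myhospitals "
    "group by cat_municip order by cat_municip"
).fetchall()

conn.close()

for cat_municip, n in counts:
    print(cat_municip, n)
for cat_municip, total in beds:
    print(cat_municip, total)
```

Note that the unmatched hospitals show up as their own group with a NULL category. If only hospitals inside an area should be counted, adding "where cat_municip is not null" to the query would drop that group.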
