Re: [postgis-users] merging geometries of buffer on large data set

Martin Davis Sun, 30 Jun 2013 10:14:39 -0700

Hi, Karsten.

One potential problem with merging all the buffers for a given value is that you may wind up with a very large polygon that creates its own handling issues.

If you only need this for analytical purposes, not for display, then it might work better to aim for a middle ground. You could merge the buffers in "clumps". To do this, you need to partition the buffers into spatially-coherent groups. A simple way to do this is to create a grid over the area (say based on the coordinate system you are using). Create a representative point for each buffer (e.g. the interior point or centroid). The buffers can then be assigned to the grid cell their point lies in. Then group the buffers by their grid cell, and union each group.
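
The partitioning step described above can be sketched outside the database to make the logic concrete. This is only an illustration in plain Python, not the PostGIS query itself: the function names, the sample buffers and the cell size are all hypothetical, and the vertex mean is a crude stand-in for a real interior point or centroid (which in PostGIS would come from something like ST_PointOnSurface or ST_Centroid). The union of each group is left out; in the database that would be an ST_Union grouped by the cell key.

```python
import math
from collections import defaultdict


def representative_point(ring):
    """Mean of the ring vertices: a crude stand-in for a centroid (assumption)."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return sum(xs) / len(xs), sum(ys) / len(ys)


def grid_cell(point, cell_size):
    """Integer (column, row) of the grid cell that contains the point."""
    x, y = point
    return math.floor(x / cell_size), math.floor(y / cell_size)


def clump_by_grid(buffers, cell_size):
    """Map each grid cell to the ids of the buffers whose point falls in it."""
    groups = defaultdict(list)
    for buffer_id, ring in buffers.items():
        cell = grid_cell(representative_point(ring), cell_size)
        groups[cell].append(buffer_id)
    return dict(groups)


# Hypothetical sample data: three square buffers given as vertex lists.
buffers = {
    "a": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "b": [(1, 1), (3, 1), (3, 3), (1, 3)],
    "c": [(12, 0), (14, 0), (14, 2), (12, 2)],
}

clumps = clump_by_grid(buffers, cell_size=10)
# Buffers "a" and "b" share a cell and would be unioned together;
# "c" lands in a neighbouring cell and forms its own clump.
```

Each value in `clumps` is one group to union. Because a buffer is assigned by a single point, every buffer ends up in exactly one group even when it straddles a cell boundary.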

Actually, if you want to carry on and create single polygons, then the clumped buffer polygons give a better basis to work from (since they should have many fewer points than the source polygons).

If there are still memory issues with the clumped unions, then you can carry out this process in an iterated fashion, starting with small grid cells and then repeating with a larger size. (At this point you will have basically reimplemented the GEOS CascadedUnion algorithm. But that's ok - performing the algorithm at the SQL level allows more control over memory and processing usage.)
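
A rough sketch of the iterated scheme, again in plain Python rather than SQL. To keep it self-contained, each geometry is modelled as a set of unit raster cells so that a set union can stand in for the polygon union (in PostGIS this would be ST_Union inside the database). The function names, sample shapes and cell sizes are hypothetical choices for illustration only.

```python
import math
from collections import defaultdict


def representative_point(cells):
    """Mean of the occupied unit cells: a stand-in for a centroid (assumption)."""
    n = len(cells)
    return sum(x for x, _ in cells) / n, sum(y for _, y in cells) / n


def union_pass(shapes, cell_size):
    """Group shapes by the grid cell of their point and union each group."""
    groups = defaultdict(set)
    for shape in shapes:
        x, y = representative_point(shape)
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        groups[key] |= shape  # set union stands in for a polygon union
    return [frozenset(group) for group in groups.values()]


def iterated_union(shapes, cell_sizes):
    """Run one union pass per cell size, smallest first; record shape counts."""
    counts = []
    for size in sorted(cell_sizes):
        shapes = union_pass(shapes, size)
        counts.append(len(shapes))
    return shapes, counts


# Hypothetical input: four small shapes along the x axis.
shapes = [
    {(0, 0), (1, 0)},
    {(1, 0), (2, 0)},
    {(5, 0), (6, 0)},
    {(20, 0), (21, 0)},
]

merged, counts = iterated_union(shapes, cell_sizes=[4, 16, 64])
# counts records how many shapes remain after each pass.
```

Each pass only ever unions the shapes that fall in one cell, so the working set per union stays small, and later passes start from fewer, already-merged inputs.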


I'd be interested to hear if this approach works out.

Martin

On 6/28/2013 5:11 PM, karsten vennemann wrote:

I was wondering if there is a good (or best practice) approach on how to merge geometry features that are touching or overlapping and have one common value in one table field.
Here is what I was trying to do: given a large dataset such as the detailed NHD data layer of rivers in California, I created multiple buffers and inserted the results into a new table with one geometry column, adding a score value for each of the buffer distances used. Thus the buffer polygon layer has a score with a value of 10, 100, 500 or 1000 m corresponding to the buffer distance used. Given the approach I used to create the buffers, those are often spatially overlapping (because there was no merge operation of the buffers and because the rivers are split along the flow line in multiple segments by node in the source shape file). The resulting layer works ok for my purposes, which is to retrieve information on which buffers a certain location falls in by intersecting it with the river buffers (results can be 10, 100, 500 and 1000, or no intersect with the buffers).
Now the layer is about 20 GB on disk and contains a lot of unnecessary overlapping geometries.
How can I go about merging all the existing geometries in this huge data set into a result layer that has (optimally) only 4 polygons with the result scores to find my intersects?
When I tried some of my own approaches (e.g. using st_collect and such) to do this, these resource-intensive operations were soon aborted by the system because I got some kind of out-of-memory errors on my server (an Ubuntu machine). Is there a good way to optimize this kind of query operation without using 100% of my server RAM, so that I will not run out of memory or end up with a lengthy query that would be running for 6 weeks or so :)?
Any query examples or general insight are greatly appreciated.

Cheers

Karsten

Karsten Vennemann

Terra GIS Ltd

2119 Boyer Ave E

Seattle, WA 98112

USA

tel ++ 206 905 1711

fax ++ 925 905 1711



_______________________________________________
postgis-users mailing list
[email protected]
http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users

