Sadly, I don't think it can be done in pure plpgsql, because every function runs inside a single transaction no matter what. The only way around it is the trick of connecting to the same database from within, using the "dblink" extension. But I find it difficult to understand how transactions and subtransactions affect performance.
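To make the "one transaction per chunk" idea concrete, here is a rough sketch that drives the chunks from outside the database instead of through dblink. It only generates SQL text (to be piped into psql, for instance) and does not connect to anything. The `tg` and `gid` columns come from the UPDATE quoted below; the table name `roads`, the topology name `roads_topo`, the `geom` column, layer id 1 and tolerance 0 are assumptions for illustration only. I have not benchmarked this.

```python
from typing import Iterator, List, Tuple

# Primitive tables of a PostGIS topology schema that toTopoGeom populates.
PRIMITIVE_TABLES = ("node", "edge_data", "face")


def chunk_ranges(min_gid: int, max_gid: int, chunk_size: int) -> Iterator[Tuple[int, int]]:
    """Yield half-open [start, stop) gid ranges covering min_gid..max_gid."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    start = min_gid
    while start <= max_gid:
        yield start, start + chunk_size
        start += chunk_size


def build_chunk_script(table: str, topo: str, layer_id: int, tolerance: float,
                       min_gid: int, max_gid: int, chunk_size: int) -> List[str]:
    """Return SQL statements: one transaction per chunk, then ANALYZE.

    Assumes the input table has columns gid, geom and tg (hypothetical names
    except gid and tg, which appear in the quoted UPDATE).
    """
    statements: List[str] = []
    for start, stop in chunk_ranges(min_gid, max_gid, chunk_size):
        statements.append("BEGIN;")
        statements.append(
            f"UPDATE {table} SET tg = toTopoGeom(geom, '{topo}', {layer_id}, {tolerance}) "
            f"WHERE gid >= {start} AND gid < {stop};"
        )
        statements.append("COMMIT;")
        # Refresh planner statistics so the next chunk does not plan
        # against tables that still look empty.
        for prim in PRIMITIVE_TABLES:
            statements.append(f"ANALYZE {topo}.{prim};")
    return statements


if __name__ == "__main__":
    # Hypothetical names and sizes; adjust to the real table and topology.
    print("\n".join(build_chunk_script("roads", "roads_topo", 1, 0.0, 1, 250, 100)))
```

The output can be saved to a file and run with psql. Whether the ANALYZE after each chunk is enough to keep the planner off sequential scans is exactly what would need testing.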
Moreover, the transaction thing is not the only problem. It is more of a design problem.
Even with CGAL, building a topology one element at a time rather than in batch mode radically changes the build time (from n to n^2 at least).
That's why I truly think the performance is going to come from a batch mode. Tweaking the current process is just damage control, in my opinion.

This is not so hard to do if we rely a bit on GEOS:
1. cut the input geoms into a space partition (ST_Node for lines, ST_Polygonize for polygons)
2. populate the node table, and create a temp table with the list of lines for each node
3. populate edge_data
4. fill next / left for edge_data
5. compute area (Polygonize, GEOS?)
6. map the input geoms to the generated topology (to be able to use attributes)

I already tested 1, 2, 3 and 6.
It can be fast (though not as fast as building the full topology in GEOS and converting it to postgis_topology, I'm afraid), and it will scale very well.

Cheers,
Rémi-C

2014-11-20 12:24 GMT+01:00 Sandro Santilli <[email protected]>:

> On Wed, Nov 19, 2014 at 04:47:48PM +0100, Sandro Santilli wrote:
> > On Wed, Nov 19, 2014 at 12:50:09PM +0100, Rémi Cura wrote:
> > >
> > > Adding one feature is actually quite fast, even on already big topology.
> > >
> > > It's when you want to add a lot that it becomes increasingly slow (maybe
> > > because indexes are not updated, or because we are in one transaction?)
> > >
> > > The slowing seems to be very non linear, probably following n^2, where n is
> > > the number of features already constructed in the transaction.
> >
> > An issue with index use was recently fixed.
> > There might be another one hiding somewhere.
>
> On a closer look, I'm thinking the single-transaction is what commonly
> hits during topology building (UPDATE .. SET tg = toTopoGeom ..)
>
> Starting from an empty topology and running a single statement
> invoking toTopoGeom for each of many inputs results in no stats ever
> being visible by the planner within the transaction. In turn this
> is likely to opt for sequential scans (an empty table is quicker to
> scan sequentially).
>
> This would explain why populating in chunks works better, using
> a transaction for each chunk
> (UPDATE .. SET .. WHERE gid >= N AND gid < N+chunksize)
>
> It could be interesting to try a wrapper function taking care of
> running ANALYZE on the primitive tables every N calls to toTopoGeom
> (or N primitives being created, regardless of number of simple inputs).
>
> --strk;
>
>  ()  ASCII ribbon campaign  --  Keep it simple !
>  /\  http://strk.keybit.net/rants/ascii_mails.txt
> _______________________________________________
> postgis-users mailing list
> [email protected]
> http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
>
