#997: ST_Union performance degradation from GEOS 3.6.x -> 3.7.x -------------------------+-------------------------- Reporter: greenlaw | Owner: geos-devel@… Type: defect | Status: new Priority: major | Milestone: Component: Default | Version: 3.7.0 Severity: Significant | Keywords: union -------------------------+-------------------------- I [https://lists.osgeo.org/pipermail/postgis- users/2019-October/043688.html posted] about this earlier this week on the postgis-users mailing list.
I've encountered a severe performance issue with GEOS >=3.7.0 when calling `ST_Union(ST_Buffer(...))` on some large polygons (EPSG:3857) with lots of vertices. I've been testing primarily on PostgreSQL 9.5 but have reproduced the issue with multiple versions of PostGIS (2.2 to 2.5). This appears to be a separate issue from #867. That ticket references a recent JTS performance enhancement which was ported to GEOS and merged to master, but I tested with the latest version (which contains that fix), 3.8.0rc3, and it didn't make a difference (actually performance was worse). With PostgreSQL 9.5, PostGIS 2.3.7, and GEOS 3.6.4, my query takes ~100 sec to run, but after upgrading GEOS to 3.7.0, the query duration balloons to ~70 minutes. With 3.8.0rc3, duration again increased (to ~147 minutes). Exact timings for different version combinations is pasted below. I will attach a pgdump of the table (21 rows, 23mb uncompressed). Steps to reproduce: 1. Have PostgreSQL 9.5 instance with PostGIS 2.2 - 2.5 and GEOS >= 3.7.0 2. Import the pgdump {{{ psql -h hostname -p port -U postgres -d dbname -f pgdump_nws_haz_hang_201910.sql }}} 3. Execute the query {{{ EXPLAIN ANALYZE SELECT ST_Union(ST_Buffer(wkb_geometry, 100.0))::geometry(Geometry, 3857),category,vteccode,prod_type,validtime,starttime,endtime FROM public.nws_haz_hang_201910 GROUP BY category,vteccode,prod_type,validtime,starttime,endtime; }}} The query will likely take between 1 - 3 hours to complete on most systems. Timings for specific version combinations are pasted below. 3.6.1 = ~85 sec: {{{ POSTGIS="2.3.7 r16523" PGSQL="95" GEOS="3.6.1-CAPI-1.10.1 r0" PROJ="Rel. 4.9.3, 15 August 2016" GDAL="GDAL 1.11.4, released 2016/01/25" LIBXML="2.9.1" LIBJSON="0.11" RASTER EXPLAIN ANALYZE SELECT ST_Union(ST_Buffer(wkb_geometry, 100.0))::geometry(Geometry, 3857),category,vteccode,prod_type,validtime,starttime,endtime FROM public.nws_haz_hang_201910 GROUP BY category,vteccode,prod_type,validtime,starttime,endtime; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------ HashAggregate (cost=10.30..10.45 rows=10 width=706) (actual time=35922.055..85310.959 rows=5 loops=1) Group Key: category, vteccode, prod_type, validtime, starttime, endtime -> Seq Scan on nws_haz_hang_201910 (cost=0.00..10.10 rows=10 width=706) (actual time=0.014..0.053 rows=21 loops=1) Planning time: 0.460 ms Execution time: 85314.516 ms }}} 3.6.2 = ~90 sec: {{{ POSTGIS="2.3.7 r16523" PGSQL="95" GEOS="3.6.2-CAPI-1.10.2 4d2925d6" PROJ="Rel. 4.9.3, 15 August 2016" GDAL="GDAL 1.11.4, released 2016/01/25" LIBXML="2.9.1" LIBJSON="0.11" RASTER EXPLAIN ANALYZE SELECT ST_Union(ST_Buffer(wkb_geometry, 100.0))::geometry(Geometry, 3857),category,vteccode,prod_type,validtime,starttime,endtime FROM public.nws_haz_hang_201910 GROUP BY category,vteccode,prod_type,validtime,starttime,endtime; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------ HashAggregate (cost=10.30..10.45 rows=10 width=706) (actual time=34092.861..90393.216 rows=5 loops=1) Group Key: category, vteccode, prod_type, validtime, starttime, endtime -> Seq Scan on nws_haz_hang_201910 (cost=0.00..10.10 rows=10 width=706) (actual time=0.012..0.047 rows=21 loops=1) Planning time: 0.315 ms Execution time: 90393.369 ms }}} 3.6.3 = ~89 sec: {{{ POSTGIS="2.3.7 r16523" PGSQL="95" GEOS="3.6.3-CAPI-1.10.3 80c13047" PROJ="Rel. 4.9.3, 15 August 2016" GDAL="GDAL 1.11.4, released 2016/01/25" LIBXML="2.9.1" LIBJSON="0.11" RASTER EXPLAIN ANALYZE SELECT ST_Union(ST_Buffer(wkb_geometry, 100.0))::geometry(Geometry, 3857),category,vteccode,prod_type,validtime,starttime,endtime FROM public.nws_haz_hang_201910 GROUP BY category,vteccode,prod_type,validtime,starttime,endtime; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------ HashAggregate (cost=10.30..10.45 rows=10 width=706) (actual time=33833.006..88916.065 rows=5 loops=1) Group Key: category, vteccode, prod_type, validtime, starttime, endtime -> Seq Scan on nws_haz_hang_201910 (cost=0.00..10.10 rows=10 width=706) (actual time=0.011..0.047 rows=21 loops=1) Planning time: 0.306 ms Execution time: 88916.182 ms }}} 3.6.4 = ~100 sec: {{{ POSTGIS="2.3.7 r16523" PGSQL="95" GEOS="3.6.4-CAPI-1.10.4 ba90ca5" PROJ="Rel. 4.9.3, 15 August 2016" GDAL="GDAL 1.11.4, released 2016/01/25" LIBXML="2.9.1" LIBJSON="0.11" RASTER EXPLAIN ANALYZE SELECT ST_Union(ST_Buffer(wkb_geometry, 100.0))::geometry(Geometry, 3857),category,vteccode,prod_type,validtime,starttime,endtime FROM public.nws_haz_hang_201910 GROUP BY category,vteccode,prod_type,validtime,starttime,endtime; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------ HashAggregate (cost=10.30..10.45 rows=10 width=706) (actual time=35376.572..99728.566 rows=5 loops=1) Group Key: category, vteccode, prod_type, validtime, starttime, endtime -> Seq Scan on nws_haz_hang_201910 (cost=0.00..10.10 rows=10 width=706) (actual time=0.024..0.052 rows=21 loops=1) Planning time: 0.408 ms Execution time: 99728.803 ms }}} 3.7.0 = ~70 min: {{{ POSTGIS="2.3.7 r16523" PGSQL="95" GEOS="3.7.0-CAPI-1.11.0 673b9939" PROJ="Rel. 4.9.3, 15 August 2016" GDAL="GDAL 1.11.4, released 2016/01/25" LIBXML="2.9.1" LIBJSON="0.11" RASTER EXPLAIN ANALYZE SELECT ST_Union(ST_Buffer(wkb_geometry, 100.0))::geometry(Geometry, 3857),category,vteccode,prod_type,validtime,starttime,endtime FROM public.nws_haz_hang_201910 GROUP BY category,vteccode,prod_type,validtime,starttime,endtime; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------ HashAggregate (cost=10.30..10.45 rows=10 width=706) (actual time=30353.481..4204487.862 rows=5 loops=1) Group Key: category, vteccode, prod_type, validtime, starttime, endtime -> Seq Scan on nws_haz_hang_201910 (cost=0.00..10.10 rows=10 width=706) (actual time=0.017..0.053 rows=21 loops=1) Planning time: 0.362 ms Execution time: 4204488.069 ms }}} 3.7.1 = ~72 min: {{{ POSTGIS="2.3.7 r16523" PGSQL="95" GEOS="3.7.1-CAPI-1.11.1 27a5e771" PROJ="Rel. 4.9.3, 15 August 2016" GDAL="GDAL 1.11.4, released 2016/01/25" LIBXML="2.9.1" LIBJSON="0.11" RASTER EXPLAIN ANALYZE SELECT ST_Union(ST_Buffer(wkb_geometry, 100.0))::geometry(Geometry, 3857),category,vteccode,prod_type,validtime,starttime,endtime FROM public.nws_haz_hang_201910 GROUP BY category,vteccode,prod_type,validtime,starttime,endtime; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------ HashAggregate (cost=10.30..10.45 rows=10 width=706) (actual time=31342.188..4323753.205 rows=5 loops=1) Group Key: category, vteccode, prod_type, validtime, starttime, endtime -> Seq Scan on nws_haz_hang_201910 (cost=0.00..10.10 rows=10 width=706) (actual time=0.021..0.055 rows=21 loops=1) Planning time: 0.446 ms Execution time: 4323755.076 ms }}} 3.7.2 = ~70 min: {{{ POSTGIS="2.3.7 r16523" PGSQL="95" GEOS="3.7.2-CAPI-1.11.2 b55d2125" PROJ="Rel. 4.9.3, 15 August 2016" GDAL="GDAL 1.11.4, released 2016/01/25" LIBXML="2.9.1" LIBJSON="0.11" RASTER EXPLAIN ANALYZE SELECT ST_Union(ST_Buffer(wkb_geometry, 100.0))::geometry(Geometry, 3857),category,vteccode,prod_type,validtime,starttime,endtime FROM public.nws_haz_hang_201910 GROUP BY category,vteccode,prod_type,validtime,starttime,endtime; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------ HashAggregate (cost=10.30..10.45 rows=10 width=706) (actual time=33643.593..4195304.854 rows=5 loops=1) Group Key: category, vteccode, prod_type, validtime, starttime, endtime -> Seq Scan on nws_haz_hang_201910 (cost=0.00..10.10 rows=10 width=706) (actual time=0.019..0.052 rows=21 loops=1) Planning time: 0.374 ms Execution time: 4195305.021 ms }}} 3.8.0rc3 = ~147 min: {{{ POSTGIS="2.5.3 r17699" [EXTENSION] PGSQL="95" GEOS="3.8.0rc3-CAPI-1.13.1 " PROJ="Rel. 6.2.0, September 1st, 2019" GDAL="GDAL 3.0.1, released 2019/06/28" LIBXML="2.9.1" LIBJSON="0.11" LIBPROTOBUF="1.0.2" RASTER EXPLAIN ANALYZE SELECT ST_Union(ST_Buffer(wkb_geometry, 100.0))::geometry(Geometry, 3857),category,vteccode,prod_type,validtime,starttime,endtime FROM public.nws_haz_hang_201910 GROUP BY category,vteccode,prod_type,validtime,starttime,endtime; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------ HashAggregate (cost=10.30..10.45 rows=10 width=706) (actual time=42941.634..8834235.340 rows=5 loops=1) Group Key: category, vteccode, prod_type, validtime, starttime, endtime -> Seq Scan on nws_haz_hang_201910 (cost=0.00..10.10 rows=10 width=706) (actual time=0.016..0.048 rows=21 loops=1) Planning time: 0.198 ms Execution time: 8834235.446 ms }}} -- Ticket URL: <https://trac.osgeo.org/geos/ticket/997> GEOS <http://trac.osgeo.org/geos> GEOS (Geometry Engine - Open Source) is a C++ port of the Java Topology Suite (JTS).
_______________________________________________ geos-devel mailing list geos-devel@lists.osgeo.org https://lists.osgeo.org/mailman/listinfo/geos-devel