Jonas,
Thank you for the report. There is definitely a regression here and I think you are correct that it is in GEOS. Looks to be a regression between GEOS 3.6.1 and GEOS 3.6.2 so should be easy enough to nail down. If I run on windows PostGIS 2.3.2 GEOS 3.6.1 I get comparable speeds to your Windows – runs for me in 28.1 secs (but my box is just a workstation with 16 gb ram and vanilla conf). -- 28 seconds POSTGIS="2.3.2 r15302" GEOS="3.6.1-CAPI-1.10.1 r4317" PROJ="Rel. 4.9.1, 04 March 2015" GDAL="GDAL 2.1.3, released 2017/20/01" LIBXML="2.7.8" LIBJSON="0.12" RASTER PostgreSQL 9.6.8, compiled by Visual C++ build 1800, 64-bit If I swap out just the GEOS with my development 3.7.0 version, then time goes beyond a minute (I stopped it before it finished). If I upgrade to PostGIS 2.4.4 and swap back to my GEOS 3.6.1, time is 28.3 secs POSTGIS="2.4.4 r16526" PGSQL="96" GEOS="3.6.1-CAPI-1.10.1 r4317" PROJ="Rel. 4.9.3, 15 August 2016" GDAL="GDAL 2.1.3, released 2017/20/01" LIBXML="2.7.8" LIBJSON="0.12" LIBPROTOBUF="1.2.1" RASTER PostgreSQL 9.6.8, compiled by Visual C++ build 1800, 64-bit If I then switch that to GEOS 3.6.2 time balloons again to beyond 1 minute POSTGIS="2.4.4 r16526" PGSQL="96" GEOS="3.6.2-CAPI-1.10.2 4d2925d" PROJ="Rel. 4.9.3, 15 August 2016" GDAL="GDAL 2.1.3, released 2017/20/01" LIBXML="2.7.8" LIBJSON="0.12" LIBPROTOBUF="1.2.1" RASTER PostgreSQL 9.6.8, compiled by Visual C++ build 1800, 64-bit If you can ticket this that would be great and include the file you sent me as an attachment zipped. https://postgis.net/support/ You can flag it as a GEOS milestone bug, since it's not in PostGIS proper and needs to be fixed in GEOS. Thanks, Regina From: postgis-users [mailto:[email protected]] On Behalf Of Jonas Nygaard Pedersen Sent: Tuesday, April 24, 2018 7:58 AM To: 'PostGIS Users Discussion' <[email protected]> Subject: Re: [postgis-users] Performance problems with ST_Union on postgres 9.6, postgis 2.4.3 running on top of Red Hat 7.4 Hi Regina I’ve enclosed a sample dataset of 31489 rows (enclosed in separate email to Regina) that I hope can serve as a basis for more tests. It’s definitely not top secret, just building polygons for an area in Western Jutland, Denmark (the total dataset would be for all of Denmark). It should be noted that both the windows and redhat machine are specked with NVMe ssd disks that holds the postgres instance and data. The windows machine has 64 and the redhat has 128 GB of ram. Both have fairly recent workstation/server-grade cpus. Windows: 'POSTGIS="2.3.2 r15302" GEOS="3.5.0-CAPI-1.9.0 r4084" PROJ="Rel. 4.9.2, 08 September 2015" GDAL="GDAL 1.11.5, released 2016/07/01" LIBXML="2.9.4" TOPOLOGY RASTER' ‘PostgreSQL 9.6.3 on x86_64-pc-mingw64, compiled by gcc.exe (Rev5, Built by MSYS2 project) 4.9.2, 64-bit’ Red Hat 7.4: 'POSTGIS="2.4.3 r16312" PGSQL="96" GEOS="3.6.2-CAPI-1.10.2 4d2925d6" SFCGAL="1.2.2" PROJ="Rel. 4.9.3, 15 August 2016" GDAL="GDAL 1.11.4, released 2016/01/25" LIBXML="2.9.1" LIBJSON="0.11" RASTER' ‘PostgreSQL 9.6.8 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16), 64-bit’ On the windows version the statement: select st_multi((st_dump(geom)).geom)::geometry(multipolygon,25832) as geom from ( select st_union(b.geom) as geom from temp_jonyp.st_union_test b ) foo ; Runs with a ‘Total query runtime: 19.5 secs 18423 rows retrieved.’ The same statement and data on the redhat machine runs with a ‘Total query runtime: 01:17 minutes 18423 rows retrieved.’ Both have almost identical execution plans: Subquery Scan on foo (cost=1428.62..1433.88 rows=1000 width=32) -> Aggregate (cost=1428.62..1428.62 rows=1 width=32) -> Seq Scan on st_union_test b (cost=0.00..1349.89 rows=31489 width=170) Both have been vacuumed and analyzed beforehand. And max_parallel_workers have been set to 0 on the redhat machine. I haven’t tried to do a postgis ticket, but let me know if you need that instead. Kind regards Jonas Fra: postgis-users [mailto:[email protected]] På vegne af Regina Obe Sendt: 24. april 2018 09:07 Til: 'PostGIS Users Discussion' <[email protected] <mailto:[email protected]> > Emne: Re: [postgis-users] Performance problems with ST_Union on postgres 9.6, postgis 2.4.3 running on top of Red Hat 7.4 Couple of things to try 1) Can you reduce the dataset down a bit to the point you can run it in a reasonable amount of time and see differences? 2) On the 2.4.3 plan I see it is running using parallel mode, can you turn off the parallel to rule that out as an issue. Set max_parallel_workers_per_gather = 0; I don't know what your 2.4.2 was running, but for 2.3, it probably wouldn't run in parallel just because fewer functions were flagged as parallel safe If you can reduce the set some something reasonable where we can test, I'd be happy to test on various OS/ PostGIS config. I wouldn't rule out a regression issue on our end. You can provide a sample on a postgis ticket, if it's not too top secret or just send directly to me. http://postgis.net/support/ Thanks, Regina From: postgis-users [mailto:[email protected]] On Behalf Of Jonas Nygaard Pedersen Sent: Monday, April 23, 2018 11:15 AM To: '[email protected]' <[email protected] <mailto:[email protected]> > Subject: [postgis-users] Performance problems with ST_Union on postgres 9.6, postgis 2.4.3 running on top of Red Hat 7.4 Hi list I'm facing some performance issues when trying to execute the following query on a Red Hat 7.4 machine with Postgres 9.6 (PostgreSQL 9.6.8 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16), 64-bit) and postgis 2.4.3: select st_multi((st_dump(geom)).geom)::geometry(multipolygon,25832) as geom from ( select st_union(b.geom) as geom from b.gdk_bygning b ) foo; Previously the statement above finished in about 3.5 hours but now it seems to go on forever, and I'll cancel it after 3 days. The table contains 5310482 rows and the CREATE TABLE statement looks like this: CREATE TABLE b.gdk_bygning ( id_lokalid bigint, objektstatus character varying(4000), registreringfra timestamp(6) without time zone, virkningfra timestamp(6) without time zone, plannoejagtighed double precision, vertikalnoejagtighed double precision, bygninguuid character varying(4000), bygningstype character varying(4000), underminimumbygning character varying(5), status character varying(4000), geom geometry(Polygon,25832) ) Indexes: CREATE INDEX gdk_bygning_id_lokalid_idx ON b.gdk_bygning USING btree (id_lokalid); CREATE INDEX sidx_gdk_bygning_geom ON b.gdk_bygning USING gist (geom); I'm not sure where the issue stems from, but from an approximation of when the issue started and near coincidental update from postgis 2.4.1 to 2.4.3, I suspect that the issue is rooted here, and probably combined with some clumsy settings in my postgresql.conf file. Below I have compiled some documentation that I think will be relevant: * the original Postgis install with 'yum', * the update with 'yum', * what I think are the relevant settings from my postgresql.conf, * the query plan for the statement on the current 2.4.3 postgis version, * and lastly the query plan on a windows machine running postgis 2.3.2 (I'm not able to roll back to 2.4.1 with GEOS 3.5). The only thing that stands out to me is that the GEOS version is upgraded from 3.5.0 to 3.6.2 but I'm definitely unsure about what's going on and I hope that someone on the list can give me some advice. Regards Jonas OUTPUT OF 'yum history info' FOR ORIGINAL INSTALL OF POSTGIS: Loaded plugins: langpacks, product-id, rhnplugin, search-disabled-repos, subscription-manager This system is receiving updates from RHN Classic or Red Hat Satellite. Transaction ID : 7 Begin time : Mon Oct 9 17:56:25 2017 Begin rpmdb : 1379:99b8afbfabf2cf72ff17087d17a2a1607e29a909 End time : 17:56:35 2017 (10 seconds) End rpmdb : 1421:bd1fa883ae52216121f1a7c6a2eb06d2d6e36075 User : <b031513> Return-Code : Success Command Line : install postgis24_96.x86_64 postgis24_96-client.x86_64 postgis24_96-devel.x86_64 postgis24_96-utils.x86_64 SFCGAL.x86_64 pgrouting_96.x86_64 Transaction performed with: Installed rpm-4.11.3-25.el7.x86_64 @rhel-x86_64-server-7 Updated subscription-manager-1.19.21-1.el7.x86_64 @rhel-x86_64-server-7 Installed yum-3.4.3-154.el7.noarch @rhel-x86_64-server-7 Installed yum-metadata-parser-1.1.4-10.el7.x86_64 @anaconda/7.2 Installed yum-rhn-plugin-2.0.1-9.el7.noarch @rhel-x86_64-server-7 Packages Altered: Dep-Install CGAL-4.7-1.rhel7.x86_64 @pgdg96 Dep-Install CharLS-1.0-5.el7.x86_64 @rhel-x86_64-server-7-epel Install SFCGAL-1.2.2-1.rhel7.x86_64 @pgdg96 Dep-Install SFCGAL-libs-1.2.2-1.rhel7.x86_64 @pgdg96 Dep-Install armadillo-4.320.0-1.el7.x86_64 @rhel-x86_64-server-7-epel Dep-Install arpack-3.1.3-2.el7.x86_64 @rhel-x86_64-server-7-epel Dep-Install atlas-3.10.1-12.el7.x86_64 @rhel-x86_64-server-7 Dep-Install blas-3.4.2-8.el7.x86_64 @rhel-x86_64-server-7 Dep-Install boost-atomic-1.53.0-27.el7.x86_64 @rhel-x86_64-server-7 Dep-Install boost-chrono-1.53.0-27.el7.x86_64 @rhel-x86_64-server-7 Dep-Install boost-serialization-1.53.0-27.el7.x86_64 @rhel-x86_64-server-7 Dep-Install cfitsio-3.370-1.el7.x86_64 @rhel-x86_64-server-7-epel Dep-Install freexl-1.0.0i-1.el7.x86_64 @rhel-x86_64-server-7-epel Dep-Install gdal-libs-1.11.4-10.rhel7.x86_64 @pgdg96 Dep-Install geos-3.5.0-1.rhel7.x86_64 @pgdg96 Dep-Install hdf5-1.8.12-8.el7.x86_64 @rhel-x86_64-server-7-epel Dep-Install lapack-3.4.2-8.el7.x86_64 @rhel-x86_64-server-7 Dep-Install libdap-3.13.1-2.el7.x86_64 @rhel-x86_64-server-7-epel Dep-Install libgeotiff-1.4.0-1.rhel7.x86_64 @pgdg96 Dep-Install libgfortran-4.8.5-16.el7.x86_64 @rhel-x86_64-server-7 Dep-Install libgta-1.0.4-1.el7.x86_64 @rhel-x86_64-server-7-epel Dep-Install libquadmath-4.8.5-16.el7.x86_64 @rhel-x86_64-server-7 Dep-Install netcdf-4.3.3.1-5.el7.x86_64 @rhel-x86_64-server-7-epel Dep-Install ogdi-3.2.0-0.19.beta2.el7.x86_64 @rhel-x86_64-server-7-epel Dep-Install openjpeg2-2.1.0-7.el7.x86_64 @rhel-x86_64-server-7-epel Dep-Install perl-Compress-Raw-Bzip2-2.061-3.el7.x86_64 @rhel-x86_64-server-7 Dep-Install perl-Compress-Raw-Zlib-1:2.061-4.el7.x86_64 @rhel-x86_64-server-7 Dep-Install perl-DBD-Pg-2.19.3-4.el7.x86_64 @rhel-x86_64-server-7 Dep-Install perl-DBI-1.627-4.el7.x86_64 @rhel-x86_64-server-7 Dep-Install perl-Data-Dumper-2.145-3.el7.x86_64 @rhel-x86_64-server-7 Dep-Install perl-IO-Compress-2.061-2.el7.noarch @rhel-x86_64-server-7 Dep-Install perl-Net-Daemon-0.48-5.el7.noarch @rhel-x86_64-server-7 Dep-Install perl-PlRPC-0.2020-14.el7.noarch @rhel-x86_64-server-7 Dep-Install perl-version-3:0.99.07-2.el7.x86_64 @rhel-x86_64-server-7 Install pgrouting_96-2.5.0-1.rhel7.x86_64 @pgdg96 Install postgis24_96-2.4.0-1.rhel7.x86_64 @pgdg96 Install postgis24_96-client-2.4.0-1.rhel7.x86_64 @pgdg96 Install postgis24_96-devel-2.4.0-1.rhel7.x86_64 @pgdg96 Install postgis24_96-utils-2.4.0-1.rhel7.x86_64 @pgdg96 Dep-Install proj-4.8.0-4.el7.x86_64 @rhel-x86_64-server-7-epel Dep-Install unixODBC-2.3.1-11.el7.x86_64 @rhel-x86_64-server-7 Dep-Install xerces-c-3.1.1-8.el7_2.x86_64 @rhel-x86_64-server-7 history info OUTPUT OF 'yum history info' FOR THE UPDATE: Loaded plugins: langpacks, product-id, rhnplugin, search-disabled-repos, subscription-manager This system is receiving updates from RHN Classic or Red Hat Satellite. Transaction ID : 35 Begin time : Tue Apr 3 11:30:58 2018 Begin rpmdb : 1748:db6ede4f0b0b9815a1f8704452181b47f0a32796 End time : 11:31:04 2018 (6 seconds) End rpmdb : 1749:83a1f934015ebd6db6adc07214b1937477780af4 User : <b031513> Return-Code : Success Command Line : update postgis Transaction performed with: Installed rpm-4.11.3-25.el7.x86_64 @rhel-x86_64-server-7 Installed subscription-manager-1.19.23-1.el7_4.x86_64 @rhel-x86_64-server-7 Installed yum-3.4.3-154.el7.noarch @rhel-x86_64-server-7 Installed yum-metadata-parser-1.1.4-10.el7.x86_64 @anaconda/7.2 Installed yum-rhn-plugin-2.0.1-9.el7.noarch @rhel-x86_64-server-7 Packages Altered: Dep-Install geos36-3.6.2-3.1.rhel7.x86_64 @pgdg96 Updated postgis24_96-2.4.1-1.rhel7.x86_64 @pgdg96 Update 2.4.3-1.rhel7.x86_64 @pgdg96 Updated postgis24_96-client-2.4.1-1.rhel7.x86_64 @pgdg96 Update 2.4.3-1.rhel7.x86_64 @pgdg96 Updated postgis24_96-devel-2.4.1-1.rhel7.x86_64 @pgdg96 Update 2.4.3-1.rhel7.x86_64 @pgdg96 Updated postgis24_96-utils-2.4.1-1.rhel7.x86_64 @pgdg96 Update 2.4.3-1.rhel7.x86_64 @pgdg96 history info WHAT I THINK ARE THE RELEVANT OPTIONS IN MY 'postgresql.conf' THAT ARE DIFFERENT FROM THE ONE THAT CAME WITH 9.6 FROM 'pgdg96' REPOSITORY: shared_buffers = 50GB # min 128kB # (change requires restart) #huge_pages = try # on, off, or try # (change requires restart) #temp_buffers = 8MB # min 800kB #max_prepared_transactions = 0 # zero disables the feature # (change requires restart) work_mem = 5GB # min 64kB maintenance_work_mem = 5GB #replacement_sort_tuples = 150000 # limits use of replacement selection sort #autovacuum_work_mem = -1 # min 1MB, or -1 to use maintenance_work_mem #max_stack_depth = 2MB # min 100kB dynamic_shared_memory_type = posix # the default is the first option effective_io_concurrency = 200 # 1-1000; 0 disables prefetching max_worker_processes = 64 # (change requires restart) max_parallel_workers_per_gather = 12 # taken from max_worker_processes seq_page_cost = 1.0 # measured on an arbitrary scale random_page_cost = 2.0 # same scale as above #cpu_tuple_cost = 0.01 # same scale as above #cpu_index_tuple_cost = 0.005 # same scale as above #cpu_operator_cost = 0.0025 # same scale as above parallel_tuple_cost = 0.001 # same scale as above parallel_setup_cost = 100.0 # same scale as above #min_parallel_relation_size = 8MB effective_cache_size = 90GB QUERY PLAN AFTER UPDATE TO POSTGIS 2.4.3 ('POSTGIS="2.4.3 r16312" PGSQL="96" GEOS="3.6.2-CAPI-1.10.2 4d2925d6" SFCGAL="1.2.2" PROJ="Rel. 4.9.3, 15 August 2016" GDAL="GDAL 1.11.4, released 2016/01/25" LIBXML="2.9.1" LIBJSON="0.11" RASTER') QUERY PLAN ----------------------------------------------------------------------------------------------------- Subquery Scan on foo (cost=231732.65..231737.92 rows=1000 width=32) -> Aggregate (cost=231732.65..231732.66 rows=1 width=32) -> Gather (cost=100.00..218456.45 rows=5310482 width=169) Workers Planned: 5 -> Parallel Seq Scan on gdk_bygning b (cost=0.00..213045.96 rows=1062096 width=169) (5 rows) QUERY PLAN ON WINDOWS MACHINE ('POSTGIS="2.3.2 r15302" GEOS="3.5.0-CAPI-1.9.0 r4084" PROJ="Rel. 4.9.2, 08 September 2015" GDAL="GDAL 1.11.5, released 2016/07/01" LIBXML="2.9.4" TOPOLOGY RASTER'): QUERY PLAN ------------------------------------------------------------------------------------- Subquery Scan on foo (cost=216558.28..216563.55 rows=1000 width=32) -> Aggregate (cost=216558.28..216558.29 rows=1 width=32) -> Seq Scan on gdk_bygning b (cost=0.00..204498.02 rows=4824102 width=175) Jonas Nygaard Pedersen │Geodataanalytiker │Eff – Effektivisering│Tel. 7254 5510│[email protected]
_______________________________________________ postgis-users mailing list [email protected] https://lists.osgeo.org/mailman/listinfo/postgis-users
