Hi

________________________________

>From: Alvaro Herrera <alvhe...@2ndquadrant.com>

>Sent: Wednesday, February 19, 2020 4:23 PM

>To: Lars Aksel Opsahl <lars.ops...@nibio.no>

>Cc: Laurenz Albe <laurenz.a...@cybertec.at>; Pavel Stehule 
><pavel.steh...@gmail.com>; Tom Lane <t...@sss.pgh.pa.us>; 
>pgsql-performance@lists.postgresql.org <pgsql-performance@lists.postgresql.org>

>Subject: Re: SubtransControlLock and performance problems

>

>On 2020-Feb-19, Lars Aksel Opsahl wrote:

>

>> With the values above I did see same performance problems and we ended

>> with a lot of subtransControlLock.

>>

>> So I started to change the code based on your feedbacks.

>>

>> - What seems to work very good in combination with a catch exception

>> and retry pattern is to insert the data in to separate table for each

>> job. (I the current testcase we reduced the number of

>> subtransControlLock from many hundreds to almost none.)

>

>I think at this point your only recourse is to start taking profiles to

>see where the time is going.  Without that, you're just flying blind and

>whatever you do will not necessarily move your needle at all.


Hi

Yes I totally agree with you and yes I have tried to do some profiling and 
testing while developing.

>From the worst case to best case the time is reduced 15 times (from 300 
>minutes to 20 minutes) when testing a small dataset for with 619230 surface 
>(25909671 total line points) with the test below 
>“resolve_overlap_gap_run('org_jm.jm_ukomm_flate','figurid','geo',4258,false,'test_topo_jm',0.000001,31,3000);
> “

The reason for this seems to be related to the problems described by Laurenz 
Albe related to how Postgres handles try and catch and sub transactions, which 
I did not know about. If we don't have this is mind and we start to get 
subtranslocks it seems to kill the performance in some cases.

In this test I ran with 31 parallel threads which is very high on a server with 
only 32 cores and maybe not realistic. I just did this now see what happens 
when I try to push a server to it’s limits and maximise the performance 
increase. If I reduce this to 1 single thread, there should be now difference 
and if run on 16 threads the difference would much much smaller.

I will now start to run on datasets which are 10 times bigger to check how 
thing scales, but then run with around maybe 28 parallel jobs.

The two branches I have tested on now which should show the main difference are 
here.

https://github.com/larsop/resolve-overlap-and-gap/tree/add_postgis_topology_using_TopoGeo_addLinestringwhich
 is the faster one.

https://github.com/larsop/resolve-overlap-and-gap/tree/add_postgis_topology 
which is slower one, but here I have now added a check on number of 
subtranslocks before I kick of new jobs and that reduced time form 9 hours  to 
3 hours.


Thanks.


Lars






Reply via email to