Re: SubtransControlLock and performance problems

Lars Aksel Opsahl Thu, 20 Feb 2020 01:21:37 -0800

Hi

________________________________

>From: Alvaro Herrera <[email protected]>

>Sent: Wednesday, February 19, 2020 4:23 PM

>To: Lars Aksel Opsahl <[email protected]>

>Cc: Laurenz Albe <[email protected]>; Pavel Stehule 
><[email protected]>; Tom Lane <[email protected]>; 
>[email protected] <[email protected]>

>Subject: Re: SubtransControlLock and performance problems

>

>On 2020-Feb-19, Lars Aksel Opsahl wrote:

>

>> With the values above I did see same performance problems and we ended

>> with a lot of subtransControlLock.

>>

>> So I started to change the code based on your feedbacks.

>>

>> - What seems to work very good in combination with a catch exception

>> and retry pattern is to insert the data in to separate table for each

>> job. (I the current testcase we reduced the number of

>> subtransControlLock from many hundreds to almost none.)

>

>I think at this point your only recourse is to start taking profiles to

>see where the time is going.  Without that, you're just flying blind and

>whatever you do will not necessarily move your needle at all.

Hi

Yes I totally agree with you and yes I have tried to do some profiling and 
testing while developing.

>From the worst case to best case the time is reduced 15 times (from 300 
>minutes to 20 minutes) when testing a small dataset for with 619230 surface 
>(25909671 total line points) with the test below 
>“resolve_overlap_gap_run('org_jm.jm_ukomm_flate','figurid','geo',4258,false,'test_topo_jm',0.000001,31,3000);
> “

The reason for this seems to be related to the problems described by Laurenz 
Albe related to how Postgres handles try and catch and sub transactions, which 
I did not know about. If we don't have this is mind and we start to get 
subtranslocks it seems to kill the performance in some cases.

In this test I ran with 31 parallel threads which is very high on a server with 
only 32 cores and maybe not realistic. I just did this now see what happens 
when I try to push a server to it’s limits and maximise the performance 
increase. If I reduce this to 1 single thread, there should be now difference 
and if run on 16 threads the difference would much much smaller.

I will now start to run on datasets which are 10 times bigger to check how 
thing scales, but then run with around maybe 28 parallel jobs.

The two branches I have tested on now which should show the main difference are 
here.

https://github.com/larsop/resolve-overlap-and-gap/tree/add_postgis_topology_using_TopoGeo_addLinestringwhich
 is the faster one.

https://github.com/larsop/resolve-overlap-and-gap/tree/add_postgis_topology 
which is slower one, but here I have now added a check on number of 
subtranslocks before I kick of new jobs and that reduced time form 9 hours  to 
3 hours.

Thanks.

Lars

Re: SubtransControlLock and performance problems

Reply via email to