Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION

2023-12-11 Thread Tomas Vondra
On 12/11/23 07:12, Amit Kapila wrote:
> On Sat, Dec 9, 2023 at 12:16 PM Amit Kapila  wrote:
>>
>> Thanks, I could also reproduce the issue on back branches (tried till
>> 12), and the fix works. I'll push this on Monday.
>>
> 
> Peter sent one minor suggestion (to write the check differently for
> easier understanding) offlist which I addressed and pushed the patch.
> 

Thanks for taking care of fixing this!


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION

2023-12-10 Thread Amit Kapila
On Sat, Dec 9, 2023 at 12:16 PM Amit Kapila  wrote:
>
> Thanks, I could also reproduce the issue on back branches (tried till
> 12), and the fix works. I'll push this on Monday.
>

Peter sent one minor suggestion (to write the check differently for
easier understanding) offlist which I addressed and pushed the patch.

-- 
With Regards,
Amit Kapila.




Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION

2023-12-08 Thread Amit Kapila
On Fri, Dec 8, 2023 at 7:16 PM Shlok Kyal  wrote:
>
> > Then let's go with the original patch only. BTW, it took almost the
> > same time (105 wallclock secs) in my environment (CentOS VM) to run
> > tests in src/test/subscription both with and without the patch. I took
> > a median of five runs. I have slightly adjusted the comments and
> > commit message in the attached. If you are fine with this, we can
> > commit and backpatch this.
>
> I have tested the patch for all the branches from PG 17 to PG 12.
> The same patch applies cleanly on all branches. Also, the same patch
> resolves the issue on all the branches.
> I ran all the tests and all the tests passed on each branch.
>
> I also reviewed the patch and it looks good to me.
>

Thanks, I could also reproduce the issue on back branches (tried till
12), and the fix works. I'll push this on Monday.

-- 
With Regards,
Amit Kapila.




Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION

2023-12-08 Thread Shlok Kyal
Hi,

> Then let's go with the original patch only. BTW, it took almost the
> same time (105 wallclock secs) in my environment (CentOS VM) to run
> tests in src/test/subscription both with and without the patch. I took
> a median of five runs. I have slightly adjusted the comments and
> commit message in the attached. If you are fine with this, we can
> commit and backpatch this.

I have tested the patch on all the branches from PG 17 down to PG 12.
The same patch applies cleanly on every branch and resolves the issue
on each of them. I ran all the tests on each branch and they all passed.

I also reviewed the patch and it looks good to me.

Thanks and Regards,
Shlok Kyal




Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION

2023-12-08 Thread Amit Kapila
On Thu, Dec 7, 2023 at 11:21 AM Shlok Kyal  wrote:
>
> > I mean to commit the open transaction at the below place in
> > wait_for_relation_state_change()
> >
> > wait_for_relation_state_change()
> > {
> > ...
> > -- commit the xact
> > WaitLatch();
> > ...
> > }
> >
> > Then start after the wait is over. This is just to test whether it
> > improves the difference in regression test timing.
>
> I tried the above approach and observed that the performance of this
> approach is nearly the same as that of the previous approach.
>

Then let's go with the original patch only. BTW, it took almost the
same time (105 wallclock secs) in my environment (CentOS VM) to run
tests in src/test/subscription both with and without the patch. I took
a median of five runs. I have slightly adjusted the comments and
commit message in the attached. If you are fine with this, we can
commit and backpatch this.

-- 
With Regards,
Amit Kapila.


v3-0001-Fix-an-undetected-deadlock-due-to-apply-worker.patch
Description: Binary data


Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION

2023-12-06 Thread Shlok Kyal
Hi,

> I mean to commit the open transaction at the below place in
> wait_for_relation_state_change()
>
> wait_for_relation_state_change()
> {
> ...
> -- commit the xact
> WaitLatch();
> ...
> }
>
> Then start after the wait is over. This is just to test whether it
> improves the difference in regression test timing.

I tried the above approach and observed that the performance of this
approach is nearly the same as that of the previous approach.

For Linux VM:
Summary            | Subscription Test (sec) | 100 tables in pub and sub (sec) | 1000 tables in pub and sub (sec)
-------------------+-------------------------+---------------------------------+---------------------------------
old patch          | 107.4545                | 6.911                           | 77.918
alternate approach | 108.3985                | 6.9835                          | 78.111

For Performance Machine:
Summary            | Subscription Test (sec) | 100 tables in pub and sub (sec) | 1000 tables in pub and sub (sec)
-------------------+-------------------------+---------------------------------+---------------------------------
old patch          | 115.922                 | 6.7305                          | 81.1525
alternate approach | 115.8215                | 6.7685                          | 81.2335

I have attached the patch for this approach as 'alternate_approach.patch'.

Since the performance is the same, I think the previous approach is
better, as the alternate approach calls CommitTransactionCommand() and
StartTransactionCommand() inside a 'for' loop.
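
(Purely for illustration, a rough sketch of that alternate shape, i.e.
committing and restarting inside wait_for_relation_state_change()'s wait
loop -- this is not the attached alternate_approach.patch itself, and the
real function's other checks are omitted:)

    for (;;)
    {
        char        state;
        XLogRecPtr  statelsn;

        CHECK_FOR_INTERRUPTS();

        InvalidateCatalogSnapshot();
        state = GetSubscriptionRelState(MyLogicalRepWorker->subid,
                                        relid, &statelsn);
        if (state == SUBREL_STATE_UNKNOWN || state == expected_state)
            break;

        /* Release locks (notably on the pg_subscription row) before sleeping. */
        CommitTransactionCommand();
        pgstat_report_stat(false);

        (void) WaitLatch(MyLatch,
                         WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
                         1000L, WAIT_EVENT_LOGICAL_SYNC_STATE_CHANGE);
        ResetLatch(MyLatch);

        /* Start a new transaction for the next catalog check. */
        StartTransactionCommand();
    }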

I also fixed the comment in the previous approach and attached it here as
'v2-0001-Deadlock-when-apply-worker-tablesync-worker-and-c.patch'.

Thanks and Regards


Shlok Kyal
From 11072d138d900227b963b54d1a3626cf256db721 Mon Sep 17 00:00:00 2001
From: Shlok Kyal 
Date: Fri, 24 Nov 2023 16:37:14 +0530
Subject: [PATCH v2] Deadlock when apply worker, tablesync worker and client
 backend run in parallel.

Apply worker holds a lock on the table pg_subscription_rel and waits for
notification from the tablesync workers that the relation is synced, which
happens through updating the pg_subscription_rel row. And that wait happens in
wait_for_relation_state_change, which simply checks the row in a loop, with a
sleep by WaitLatch(). Unfortunately, the tablesync workers can't update the row
because the client backend executing ALTER SUBSCRIPTION ... REFRESH PUBLICATION
sneaked in, and waits for an AccessExclusiveLock. So the tablesync workers are
stuck in the queue and can't proceed.

The client backend is waiting for a lock held by the apply worker. The tablesync
workers can't proceed because their lock request is stuck behind the
AccessExclusiveLock request. And the apply worker is waiting for status update
from the tablesync workers. And the deadlock is undetected, because the apply
worker is not waiting on a lock, but sleeping on a latch.

We resolve the issue by releasing the lock, i.e., by committing the transaction,
before the apply worker goes into the wait state.
---
 src/backend/replication/logical/tablesync.c | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index df3c42eb5d..b71234d998 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -542,15 +542,26 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
 	LWLockRelease(LogicalRepWorkerLock);
 
 	/*
-	 * Enter busy loop and wait for synchronization worker to
-	 * reach expected state (or die trying).
+	 * If we have a transaction, we must commit it to release
+	 * any locks we have. Then start a new transaction so we 
+	 * can examine catalog state.
 	 */
-	if (!started_tx)
+	if (started_tx)
+	{
+		CommitTransactionCommand();
+		pgstat_report_stat(false);
+		StartTransactionCommand();
+	}
+	else
 	{
 		StartTransactionCommand();
 		started_tx = true;
 	}
 
+	/*
+	 * Enter busy loop and wait for synchronization worker to
+	 * reach expected state (or die trying).
+	 */
 	wait_for_relation_state_change(rstate->relid,
    SUBREL_STATE_SYNCDONE);
 }
-- 
2.34.1

From ce261399d22a7fd9ee7ccdd3b9d1cc7f4f72ce4b Mon Sep 17 00:00:00 2001
From: Shlok Kyal 
Date: Thu, 7 Dec 2023 10:55:26 +0530
Subject: [PATCH] Deadlock when apply worker, tablesync worker and client
 backend run in parallel.

Apply worker holds a lock on the table pg_subscription_rel and waits for
notification from the tablesync workers that the relation is synced, which
happens through updating the pg_subscription_rel row. And that wait happens in
wait_for_relation_state_change, which simply checks the row in a loop, with a
sleep by WaitLatch(). Unfortunately, the tablesync workers can't update the row
because the client backend executing ALTER SUBSCRIPTION ... REFRESH PUBLICATION
sneaked in, and waits for an AccessExclusiveLock. So the tablesync workers are
stuck in the queue and can't proceed.

The 

Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION

2023-12-05 Thread Shlok Kyal
On Tue, 5 Dec 2023 at 17:18, Tomas Vondra  wrote:
>
> On 12/5/23 08:14, Shlok Kyal wrote:
> > Hi,
> >
> >> As for the test results, I very much doubt the differences are not
> >> caused simply by random timing variations, or something like that. And I
> >> don't understand what "Performance Machine Linux" is, considering those
> >> timings are slower than the other two machines.
> >
> > The machine has Total Memory of 755.536 GB, 120 CPUs and RHEL 7 Operating 
> > System
> > Also find the detailed info of the performance machine attached.
> >
>
> Thanks for the info. I don't think the tests really benefit from this
> much resources, I would be rather surprised if it was faster beyond 8
> cores or so. The CPU frequency likely matters much more. Which probably
> explains why this machine was the slowest.
>
> Also, I wonder how much the results vary between the runs. I suppose you
> only did a single run for each, right?

I did 10 runs for each of the cases and reported the median result in
my previous email.
I have documented the results of the runs and attached them here.

Thanks and Regards,
Shlok Kyal


Performance_Test.xlsx
Description: MS-Excel 2007 spreadsheet


Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION

2023-12-05 Thread Tomas Vondra
On 12/5/23 08:14, Shlok Kyal wrote:
> Hi,
> 
>> As for the test results, I very much doubt the differences are not
>> caused simply by random timing variations, or something like that. And I
>> don't understand what "Performance Machine Linux" is, considering those
>> timings are slower than the other two machines.
> 
> The machine has Total Memory of 755.536 GB, 120 CPUs and RHEL 7 Operating 
> System
> Also find the detailed info of the performance machine attached.
> 

Thanks for the info. I don't think the tests really benefit from that
many resources; I would be rather surprised if it was faster beyond 8
cores or so. The CPU frequency likely matters much more, which probably
explains why this machine was the slowest.

Also, I wonder how much the results vary between the runs. I suppose you
only did a single run for each, right?

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION

2023-12-04 Thread Shlok Kyal
Hi,

> As for the test results, I very much doubt the differences are not
> caused simply by random timing variations, or something like that. And I
> don't understand what "Performance Machine Linux" is, considering those
> timings are slower than the other two machines.

The machine has 755.536 GB of total memory and 120 CPUs, and runs RHEL 7.
Detailed information about the performance machine is attached.

Thanks and Regards,
Shlok Kyal
MemTotal: 755.536 GB
MemFree: 748.281 GB
MemAvailable: 748.581 GB
Buffers: 0.002 GB
Cached: 1.366 GB
SwapCached: 0.000 GB
Active: 0.834 GB
Inactive: 0.745 GB
Active(anon): 0.221 GB
Inactive(anon): 0.028 GB
Active(file): 0.614 GB
Inactive(file): 0.717 GB
Unevictable: 0.000 GB
Mlocked: 0.000 GB
SwapTotal: 4.000 GB
SwapFree: 4.000 GB
Dirty: 0.000 GB
Writeback: 0.000 GB
AnonPages: 0.212 GB
Mapped: 0.142 GB
Shmem: 0.038 GB
Slab: 0.468 GB
SReclaimable: 0.139 GB
SUnreclaim: 0.329 GB
KernelStack: 0.020 GB
PageTables: 0.023 GB
NFS_Unstable: 0.000 GB
Bounce: 0.000 GB
WritebackTmp: 0.000 GB
CommitLimit: 381.768 GB
Committed_AS: 0.681 GB
VmallocTotal: 32768.000 GB
VmallocUsed: 1.852 GB
VmallocChunk: 32189.748 GB
Percpu: 0.035 GB
HardwareCorrupted: 0.000 GB
AnonHugePages: 0.025 GB
CmaTotal: 0.000 GB
CmaFree: 0.000 GB
HugePages_Total: 0.000 GB
HugePages_Free: 0.000 GB
HugePages_Rsvd: 0.000 GB
HugePages_Surp: 0.000 GB
Hugepagesize: 0.002 GB
DirectMap4k: 0.314 GB
DirectMap2M: 6.523 GB
DirectMap1G: 763.000 GB
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model   : 62
model name  : Intel(R) Xeon(R) CPU E7-4890 v2 @ 2.80GHz
stepping: 7
microcode   : 0x715
cpu MHz : 1762.646
cache size  : 38400 KB
physical id : 0
siblings: 30
core id : 0
cpu cores   : 15
apicid  : 0
initial apicid  : 0
fpu : yes
fpu_exception   : yes
cpuid level : 13
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb 
rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology 
nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est 
tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt 
tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm epb intel_ppin ssbd ibrs 
ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt 
dtherm ida arat pln pts md_clear spec_ctrl intel_stibp flush_l1d
bogomips: 5586.07
clflush size: 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:

NAME="Red Hat Enterprise Linux Server"
VERSION="7.8 (Maipo)"
ID="rhel"
ID_LIKE="fedora"
VARIANT="Server"
VARIANT_ID="server"
VERSION_ID="7.8"
PRETTY_NAME="Red Hat Enterprise Linux Server 7.8 (Maipo)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:7.8:GA:server"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"

REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 7"
REDHAT_BUGZILLA_PRODUCT_VERSION=7.8
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="7.8"


Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION

2023-12-04 Thread Amit Kapila
On Mon, Dec 4, 2023 at 5:30 PM Tomas Vondra
 wrote:
>
> On 12/4/23 12:37, Amit Kapila wrote:
> > On Sat, Dec 2, 2023 at 9:52 PM Shlok Kyal  wrote:
> >>
> >>> thread. I think you can compare the timing of regression tests in
> >>> subscription, with and without the patch to show there is no
> >>> regression. And probably some tests with a large number of tables for
> >>> sync with very little data.
> >>
> >> I have tested the regression test timings for subscription with and
> >> without patch. I also did the timing test for sync of subscription
> >> with the publisher for 100 and 1000 tables respectively.
> >> I have attached the test script and results of the timing test are as 
> >> follows:
> >>
>> Time taken for test to run in Linux VM
>> Summary               | Subscription Test (sec) | 100 tables in pub and sub (sec) | 1000 tables in pub and sub (sec)
>> ----------------------+-------------------------+---------------------------------+---------------------------------
>> Without patch Release | 95.564                  | 7.877                           | 58.919
>> With patch Release    | 96.513                  | 6.533                           | 45.807
>>
>> Time Taken for test to run in another Linux VM
>> Summary               | Subscription Test (sec) | 100 tables in pub and sub (sec) | 1000 tables in pub and sub (sec)
>> ----------------------+-------------------------+---------------------------------+---------------------------------
>> Without patch Release | 109.8145                | 6.4675                          | 83.001
>> With patch Release    | 113.162                 | 7.947                           | 87.113
>>
> >
> > So, on some machines, it may increase the test timing although not too
> > much. I think the reason is probably doing the work in multiple
> > transactions for multiple relations. I am wondering that instead of
> > committing and starting a new transaction before
> > wait_for_relation_state_change(), what if we do it inside that
> > function just before we decide to wait? It is quite possible that in
> > many cases we don't need any wait at all.
> >
>
> I'm not sure what you mean by "do it". What should the function do?
>

I mean to commit the open transaction at the below place in
wait_for_relation_state_change()

wait_for_relation_state_change()
{
...
-- commit the xact
WaitLatch();
...
}

Then start after the wait is over. This is just to test whether it
improves the difference in regression test timing.

> As for the test results, I very much doubt the differences are not
> caused simply by random timing variations, or something like that. And I
> don't understand what "Performance Machine Linux" is, considering those
> timings are slower than the other two machines.
>
> Also, even if it was a bit slower, does it really matter? I mean, the
> current code is wrong, can lead to infinite duration if it happens to
> hit the deadlock. And it's a one-time action, I don't think it's a very
> sensitive in terms of performance.
>

Yeah, I see that point, but I am trying to evaluate whether we can avoid
an increase in regression test timing, at least for HEAD. The tests were
done in release mode, so I suspect there could be a slightly bigger gap
in debug mode (we can check that once, though), which may hit developers
who run the regression tests quite often in their development
environments. Now, if there is no easy way to avoid the increase in
regression test timing, we still have to fix the problem, so we have to
take the hit of some increase in time.

-- 
With Regards,
Amit Kapila.




Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION

2023-12-04 Thread Tomas Vondra



On 12/4/23 12:37, Amit Kapila wrote:
> On Sat, Dec 2, 2023 at 9:52 PM Shlok Kyal  wrote:
>>
>>> thread. I think you can compare the timing of regression tests in
>>> subscription, with and without the patch to show there is no
>>> regression. And probably some tests with a large number of tables for
>>> sync with very little data.
>>
>> I have tested the regression test timings for subscription with and
>> without patch. I also did the timing test for sync of subscription
>> with the publisher for 100 and 1000 tables respectively.
>> I have attached the test script and results of the timing test are as 
>> follows:
>>
>> Time taken for test to run in Linux VM
>> Summary               | Subscription Test (sec) | 100 tables in pub and sub (sec) | 1000 tables in pub and sub (sec)
>> ----------------------+-------------------------+---------------------------------+---------------------------------
>> Without patch Release | 95.564                  | 7.877                           | 58.919
>> With patch Release    | 96.513                  | 6.533                           | 45.807
>>
>> Time Taken for test to run in another Linux VM
>> Summary               | Subscription Test (sec) | 100 tables in pub and sub (sec) | 1000 tables in pub and sub (sec)
>> ----------------------+-------------------------+---------------------------------+---------------------------------
>> Without patch Release | 109.8145                | 6.4675                          | 83.001
>> With patch Release    | 113.162                 | 7.947                           | 87.113
>>
> 
> So, on some machines, it may increase the test timing although not too
> much. I think the reason is probably doing the work in multiple
> transactions for multiple relations. I am wondering that instead of
> committing and starting a new transaction before
> wait_for_relation_state_change(), what if we do it inside that
> function just before we decide to wait? It is quite possible that in
> many cases we don't need any wait at all.
> 

I'm not sure what you mean by "do it". What should the function do?

As for the test results, I very much doubt the differences are anything
but random timing variations, or something like that. And I don't
understand what "Performance Machine Linux" is, considering those
timings are slower than on the other two machines.

Also, even if it was a bit slower, does it really matter? I mean, the
current code is wrong and can lead to an infinite wait if it happens to
hit the deadlock. And it's a one-time action, so I don't think it's very
sensitive in terms of performance.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION

2023-12-04 Thread Amit Kapila
On Sat, Dec 2, 2023 at 9:52 PM Shlok Kyal  wrote:
>
> > thread. I think you can compare the timing of regression tests in
> > subscription, with and without the patch to show there is no
> > regression. And probably some tests with a large number of tables for
> > sync with very little data.
>
> I have tested the regression test timings for subscription with and
> without patch. I also did the timing test for sync of subscription
> with the publisher for 100 and 1000 tables respectively.
> I have attached the test script and results of the timing test are as follows:
>
> Time taken for test to run in Linux VM
> Summary               | Subscription Test (sec) | 100 tables in pub and sub (sec) | 1000 tables in pub and sub (sec)
> ----------------------+-------------------------+---------------------------------+---------------------------------
> Without patch Release | 95.564                  | 7.877                           | 58.919
> With patch Release    | 96.513                  | 6.533                           | 45.807
>
> Time Taken for test to run in another Linux VM
> Summary               | Subscription Test (sec) | 100 tables in pub and sub (sec) | 1000 tables in pub and sub (sec)
> ----------------------+-------------------------+---------------------------------+---------------------------------
> Without patch Release | 109.8145                | 6.4675                          | 83.001
> With patch Release    | 113.162                 | 7.947                           | 87.113
>

So, on some machines, it may increase the test timing, although not by
too much. I think the reason is probably that the work is done in
multiple transactions for multiple relations. Instead of committing and
starting a new transaction before wait_for_relation_state_change(), I am
wondering: what if we do it inside that function, just before we decide
to wait? It is quite possible that in many cases we don't need any wait
at all.

-- 
With Regards,
Amit Kapila.




Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION

2023-12-02 Thread Shlok Kyal
Hi,

> thread. I think you can compare the timing of regression tests in
> subscription, with and without the patch to show there is no
> regression. And probably some tests with a large number of tables for
> sync with very little data.

I have tested the regression test timings for subscription with and
without patch. I also did the timing test for sync of subscription
with the publisher for 100 and 1000 tables respectively.
I have attached the test script and results of the timing test are as follows:

Time taken for test to run in Linux VM
Summary               | Subscription Test (sec) | 100 tables in pub and sub (sec) | 1000 tables in pub and sub (sec)
----------------------+-------------------------+---------------------------------+---------------------------------
Without patch Release | 95.564                  | 7.877                           | 58.919
With patch Release    | 96.513                  | 6.533                           | 45.807

Time Taken for test to run in another Linux VM
Summary               | Subscription Test (sec) | 100 tables in pub and sub (sec) | 1000 tables in pub and sub (sec)
----------------------+-------------------------+---------------------------------+---------------------------------
Without patch Release | 109.8145                | 6.4675                          | 83.001
With patch Release    | 113.162                 | 7.947                           | 87.113

Time Taken for test to run in Performance Machine Linux
Summary               | Subscription Test (sec) | 100 tables in pub and sub (sec) | 1000 tables in pub and sub (sec)
----------------------+-------------------------+---------------------------------+---------------------------------
Without patch Release | 115.871                 | 6.656                           | 81.157
With patch Release    | 115.922                 | 6.7305                          | 81.1525

thoughts?

Thanks and Regards,
Shlok Kyal


034_tmp.pl
Description: Binary data


Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION

2023-11-25 Thread Amit Kapila
On Fri, Nov 24, 2023 at 5:05 PM Shlok Kyal  wrote:
>
> > I tried to reproduce the issue and was able to reproduce it with
> > scripts shared by Tomas.
> > I tried testing it from PG17 to PG 11. This issue is reproducible for
> > each version.
> >
> > Next I would try to test with the patch in the thread shared by Amit.
>
> I have created the v1 patch to resolve the issue. Have tested the
> patch on HEAD to PG12.
> The same patch applies to all the versions. The changes are similar to
> the one posted in the thread
> https://www.postgresql.org/message-id/1412708.1674417574%40sss.pgh.pa.us
>

(it's quite likely we hold lock on
+ * pg_replication_origin, which the sync worker will need
+ * to update).

This part of the comment is stale and doesn't hold true. You need to
update the reason based on the latest problem discovered in this
thread. I think you can compare the timing of regression tests in
subscription, with and without the patch to show there is no
regression. And probably some tests with a large number of tables for
sync with very little data.

-- 
With Regards,
Amit Kapila.




Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION

2023-11-24 Thread Shlok Kyal
Hi,

> I tried to reproduce the issue and was able to reproduce it with
> scripts shared by Tomas.
> I tried testing it from PG17 to PG 11. This issue is reproducible for
> each version.
>
> Next I would try to test with the patch in the thread shared by Amit.

I have created the v1 patch to resolve the issue and have tested it on
HEAD down to PG 12.
The same patch applies to all the versions. The changes are similar to
the one posted in the thread
https://www.postgresql.org/message-id/1412708.1674417574%40sss.pgh.pa.us

Thanks and Regards,
Shlok Kyal


v1-0001-Deadlock-when-apply-worker-tablesync-worker-and-c.patch
Description: Binary data


Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION

2023-11-23 Thread Shlok Kyal
Hi,

I tried to reproduce the issue and was able to reproduce it with
scripts shared by Tomas.
I tried testing it from PG17 to PG 11. This issue is reproducible for
each version.

Next I would try to test with the patch in the thread shared by Amit.

Thanks,
Shlok Kumar Kyal




Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION

2023-11-23 Thread Tomas Vondra


On 11/23/23 10:24, Amit Kapila wrote:
> On Wed, Nov 22, 2023 at 4:51 PM Tomas Vondra
>  wrote:
>>
>> On 11/22/23 11:38, Amit Kapila wrote:
>>>
>>> Okay. IIUC, what's going on here is that the apply worker acquires
>>> AccessShareLock on pg_subscription to update rel state for one of the
>>> tables say tbl-1, and then for another table say tbl-2, it started
>>> waiting for a state change via wait_for_relation_state_change(). I
>>> think here the fix is to commit the transaction before we go for a
>>> wait. I guess we need something along the lines of what is proposed in
>>> [1] though we have solved the problem in that thread in some other
>>> way..
>>>
>>
>> Possibly. I haven't checked if the commit might have some unexpected
>> consequences, but I can confirm I can no longer reproduce the deadlock
>> with the patch applied.
>>
> 
> Thanks for the verification. Offhand, I don't see any problem with
> doing a commit at that place but will try to think some more about it.
> I think we may want to call pgstat_report_stat(false) after commit to
> avoid a long delay in stats.
> 

Makes sense.

> I haven't verified but I think this will be a problem in
> back-branches as well.
> 

Yes. I haven't tried but I don't see why backbranches wouldn't have the
same issue.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION

2023-11-23 Thread Amit Kapila
On Wed, Nov 22, 2023 at 4:51 PM Tomas Vondra
 wrote:
>
> On 11/22/23 11:38, Amit Kapila wrote:
> >
> > Okay. IIUC, what's going on here is that the apply worker acquires
> > AccessShareLock on pg_subscription to update rel state for one of the
> > tables say tbl-1, and then for another table say tbl-2, it started
> > waiting for a state change via wait_for_relation_state_change(). I
> > think here the fix is to commit the transaction before we go for a
> > wait. I guess we need something along the lines of what is proposed in
> > [1] though we have solved the problem in that thread in some other
> > way..
> >
>
> Possibly. I haven't checked if the commit might have some unexpected
> consequences, but I can confirm I can no longer reproduce the deadlock
> with the patch applied.
>

Thanks for the verification. Offhand, I don't see any problem with
doing a commit at that place but will try to think some more about it.
I think we may want to call pgstat_report_stat(false) after commit to
avoid a long delay in stats.

I haven't verified but I think this will be a problem in back-branches as well.

-- 
With Regards,
Amit Kapila.




Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION

2023-11-22 Thread Tomas Vondra
On 11/22/23 11:38, Amit Kapila wrote:
> On Tue, Nov 21, 2023 at 6:56 PM Tomas Vondra
>  wrote:
>>
>> On 11/21/23 14:16, Amit Kapila wrote:
>>> On Tue, Nov 21, 2023 at 5:17 PM Tomas Vondra
>>>  wrote:

>>>
>>> It seems there is some inconsistency in what you have written for
>>> client backends/tablesync worker vs. apply worker. The above text
>>> seems to be saying that the client backend and table sync worker are
>>> waiting on a "subscription row in pg_subscription" and the apply
>>> worker is operating on "pg_subscription_rel". So, if that is true then
>>> they shouldn't get stuck.
>>>
>>> I think here client backend and tablesync worker seems to be blocked
>>> for a lock on pg_subscription_rel.
>>>
>>
>> Not really, they are all locking the subscription. All the locks are on
>> classid=6100, which is pg_subscription:
>>
>>   test=# select 6100::regclass;
>>   regclass
>>   -
>>pg_subscription
>>   (1 row)
>>
>> The thing is, the tablesync workers call UpdateSubscriptionRelState,
>> which locks the pg_subscription catalog at the very beginning:
>>
>>LockSharedObject(SubscriptionRelationId, ...);
>>
>> So that's the issue. I haven't explored why it's done this way, and
>> there's no comment explaining locking the subscriptions is needed ...
>>
> 
> I think it prevents concurrent drop of rel during the REFRESH operation.
> 

Yes. Or maybe some other concurrent DDL on the relations included in the
subscription.

 The tablesync workers can't proceed because their lock request is stuck
 behind the AccessExclusiveLock request.

 And the apply worker can't proceed, because it's waiting for status
 update from the tablesync workers.

>>>
>>> This part is not clear to me because
>>> wait_for_relation_state_change()->GetSubscriptionRelState() seems to
>>> be releasing the lock while closing the relation. Am I missing
>>> something?
>>>
>>
>> I think you're missing the fact that GetSubscriptionRelState() acquires
>> and releases the lock on pg_subscription_rel, but that's not the lock
>> causing the issue. The problem is the lock on the pg_subscription row.
>>
> 
> Okay. IIUC, what's going on here is that the apply worker acquires
> AccessShareLock on pg_subscription to update rel state for one of the
> tables say tbl-1, and then for another table say tbl-2, it started
> waiting for a state change via wait_for_relation_state_change(). I
> think here the fix is to commit the transaction before we go for a
> wait. I guess we need something along the lines of what is proposed in
> [1] though we have solved the problem in that thread in some other
> way..
> 

Possibly. I haven't checked if the commit might have some unexpected
consequences, but I can confirm I can no longer reproduce the deadlock
with the patch applied.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION

2023-11-22 Thread Amit Kapila
On Tue, Nov 21, 2023 at 6:56 PM Tomas Vondra
 wrote:
>
> On 11/21/23 14:16, Amit Kapila wrote:
> > On Tue, Nov 21, 2023 at 5:17 PM Tomas Vondra
> >  wrote:
> >>
> >
> > It seems there is some inconsistency in what you have written for
> > client backends/tablesync worker vs. apply worker. The above text
> > seems to be saying that the client backend and table sync worker are
> > waiting on a "subscription row in pg_subscription" and the apply
> > worker is operating on "pg_subscription_rel". So, if that is true then
> > they shouldn't get stuck.
> >
> > I think here client backend and tablesync worker seems to be blocked
> > for a lock on pg_subscription_rel.
> >
>
> Not really, they are all locking the subscription. All the locks are on
> classid=6100, which is pg_subscription:
>
>   test=# select 6100::regclass;
>   regclass
>   -
>pg_subscription
>   (1 row)
>
> The thing is, the tablesync workers call UpdateSubscriptionRelState,
> which locks the pg_subscription catalog at the very beginning:
>
>LockSharedObject(SubscriptionRelationId, ...);
>
> So that's the issue. I haven't explored why it's done this way, and
> there's no comment explaining locking the subscriptions is needed ...
>

I think it prevents concurrent drop of rel during the REFRESH operation.

> >> The tablesync workers can't proceed because their lock request is stuck
> >> behind the AccessExclusiveLock request.
> >>
> >> And the apply worker can't proceed, because it's waiting for status
> >> update from the tablesync workers.
> >>
> >
> > This part is not clear to me because
> > wait_for_relation_state_change()->GetSubscriptionRelState() seems to
> > be releasing the lock while closing the relation. Am I missing
> > something?
> >
>
> I think you're missing the fact that GetSubscriptionRelState() acquires
> and releases the lock on pg_subscription_rel, but that's not the lock
> causing the issue. The problem is the lock on the pg_subscription row.
>

Okay. IIUC, what's going on here is that the apply worker acquires
AccessShareLock on pg_subscription to update rel state for one of the
tables say tbl-1, and then for another table say tbl-2, it started
waiting for a state change via wait_for_relation_state_change(). I
think here the fix is to commit the transaction before we go for a
wait. I guess we need something along the lines of what is proposed in
[1] though we have solved the problem in that thread in some other
way..

[1] - https://www.postgresql.org/message-id/1412708.1674417574%40sss.pgh.pa.us
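
(A minimal sketch of that idea, for illustration only -- assuming the
commit goes into process_syncing_tables_for_apply() in tablesync.c just
before the wait; the names rstate->relid and SUBREL_STATE_SYNCDONE are
taken from that code, and this is not the eventual patch wording:)

    /* Commit to release the locks taken on pg_subscription ... */
    CommitTransactionCommand();
    /* ... and reopen a transaction so later catalog lookups still work. */
    StartTransactionCommand();

    wait_for_relation_state_change(rstate->relid, SUBREL_STATE_SYNCDONE);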

-- 
With Regards,
Amit Kapila.




Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION

2023-11-21 Thread Tomas Vondra



On 11/21/23 14:16, Amit Kapila wrote:
> On Tue, Nov 21, 2023 at 5:17 PM Tomas Vondra
>  wrote:
>>
>> I decided to do some stress-testing of the built-in logical replication,
>> as part of the sequence decoding work. And I soon ran into an undetected
>> deadlock related to ALTER SUBSCRIPTION ... REFRESH PUBLICATION :-(
>>
>> The attached bash scripts triggers that in a couple seconds for me. The
>> script looks complicated, but most of the code is waiting for sync to
>> complete, catchup, and that sort of thing.
>>
>> What the script does is pretty simple:
>>
>> 1) initialize two clusters, set them as publisher/subscriber pair
>>
>> 2) create some number of tables, add them to publication and wait for
>>the sync to complete
>>
>> 3) start two pgbench runs in the background, modifying the publication
>>(one removes+adds all tables in a single transaction, one does that
>> with transaction per table)
>>
>> 4) run refresh.sh which does ALTER PUBLICATION ... REFRESH PUBLICATION
>>in a loop (now that I think about it, could be another pgbench
>>script, but well ...)
>>
>> 5) some consistency checks, but the lockup happens earlier so this does
>>not really matter
>>
>> After a small number of refresh cycles (for me it's usually a couple
>> dozen), we end up with a couple stuck locks (I shortened the backend
>> type string a bit, for formatting reasons):
>>
>>   test=# select a.pid, classid, objid, backend_type, query
>>from pg_locks l join pg_stat_activity a on (a.pid = l.pid)
>>   where not granted;
>>
>>  pid   | classid | objid | backend_type | query
>>   -+-+---+--+--
>>2691941 |6100 | 16785 | client backend   | ALTER SUBSCRIPTION s
>>   REFRESH PUBLICATION
>>2691837 |6100 | 16785 | tablesync worker |
>>2691936 |6100 | 16785 | tablesync worker |
>>   (3 rows)
>>
>> All these backends wait for 6100/16785, which is the subscription row in
>> pg_subscription. The tablesync workers are requesting AccessShareLock,
>> the client backend however asks for AccessExclusiveLock.
>>
>> The entry is currently locked by:
>>
>>   test=# select a.pid, mode, backend_type from pg_locks l
>>join pg_stat_activity a on (a.pid = l.pid)
>>   where classid=6100 and objid=16785 and granted;
>>
>>  pid   |  mode   |   backend_type
>>   -+-+--
>>2690477 | AccessShareLock | logical replication apply worker
>>   (1 row)
>>
>> But the apply worker is not waiting for any locks, so what's going on?
>>
>> Well, the problem is the apply worker is waiting for notification from
>> the tablesync workers the relation is synced, which happens through
>> updating the pg_subscription_rel row. And that wait happens in
>> wait_for_relation_state_change, which simply checks the row in a loop,
>> with a sleep by WaitLatch().
>>
>> Unfortunately, the tablesync workers can't update the row because the
>> client backend executing ALTER SUBSCRIPTION ... REFRESH PUBLICATION
>> sneaked in, and waits for an AccessExclusiveLock. So the tablesync
>> workers are stuck in the queue and can't proceed.
>>
>> The client backend can't proceed, because it's waiting for a lock held
>> by the apply worker.
>>
> 
> It seems there is some inconsistency in what you have written for
> client backends/tablesync worker vs. apply worker. The above text
> seems to be saying that the client backend and table sync worker are
> waiting on a "subscription row in pg_subscription" and the apply
> worker is operating on "pg_subscription_rel". So, if that is true then
> they shouldn't get stuck.
> 
> I think here client backend and tablesync worker seems to be blocked
> for a lock on pg_subscription_rel.
> 

Not really, they are all locking the subscription. All the locks are on
classid=6100, which is pg_subscription:

  test=# select 6100::regclass;
  regclass
  -
   pg_subscription
  (1 row)

The thing is, the tablesync workers call UpdateSubscriptionRelState,
which locks the pg_subscription catalog at the very beginning:

   LockSharedObject(SubscriptionRelationId, ...);

So that's the issue. I haven't explored why it's done this way, and
there's no comment explaining locking the subscriptions is needed ...
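
(For context, a simplified sketch of that call site -- illustrative only;
the real UpdateSubscriptionRelState() in pg_subscription.c does more than
this. The lock mode matches the AccessShareLock shown in pg_locks above:)

    void
    UpdateSubscriptionRelState(Oid subid, Oid relid, char state,
                               XLogRecPtr sublsn)
    {
        Relation    rel;

        /* Shared lock on the subscription object itself (classid 6100). */
        LockSharedObject(SubscriptionRelationId, subid, 0, AccessShareLock);

        rel = table_open(SubscriptionRelRelationId, RowExclusiveLock);

        /* ... look up and update the matching pg_subscription_rel tuple ... */

        table_close(rel, NoLock);
    }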

>> The tablesync workers can't proceed because their lock request is 

Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION

2023-11-21 Thread Amit Kapila
On Tue, Nov 21, 2023 at 5:17 PM Tomas Vondra
 wrote:
>
> I decided to do some stress-testing of the built-in logical replication,
> as part of the sequence decoding work. And I soon ran into an undetected
> deadlock related to ALTER SUBSCRIPTION ... REFRESH PUBLICATION :-(
>
> The attached bash scripts triggers that in a couple seconds for me. The
> script looks complicated, but most of the code is waiting for sync to
> complete, catchup, and that sort of thing.
>
> What the script does is pretty simple:
>
> 1) initialize two clusters, set them as publisher/subscriber pair
>
> 2) create some number of tables, add them to publication and wait for
>the sync to complete
>
> 3) start two pgbench runs in the background, modifying the publication
>(one removes+adds all tables in a single transaction, one does that
> with transaction per table)
>
> 4) run refresh.sh which does ALTER PUBLICATION ... REFRESH PUBLICATION
>in a loop (now that I think about it, could be another pgbench
>script, but well ...)
>
> 5) some consistency checks, but the lockup happens earlier so this does
>not really matter
>
> After a small number of refresh cycles (for me it's usually a couple
> dozen), we end up with a couple stuck locks (I shortened the backend
> type string a bit, for formatting reasons):
>
>   test=# select a.pid, classid, objid, backend_type, query
>from pg_locks l join pg_stat_activity a on (a.pid = l.pid)
>   where not granted;
>
>  pid   | classid | objid | backend_type | query
>   -+-+---+--+--
>2691941 |6100 | 16785 | client backend   | ALTER SUBSCRIPTION s
>   REFRESH PUBLICATION
>2691837 |6100 | 16785 | tablesync worker |
>2691936 |6100 | 16785 | tablesync worker |
>   (3 rows)
>
> All these backends wait for 6100/16785, which is the subscription row in
> pg_subscription. The tablesync workers are requesting AccessShareLock,
> the client backend however asks for AccessExclusiveLock.
>
> The entry is currently locked by:
>
>   test=# select a.pid, mode, backend_type from pg_locks l
>join pg_stat_activity a on (a.pid = l.pid)
>   where classid=6100 and objid=16785 and granted;
>
>  pid   |  mode   |   backend_type
>   -+-+--
>2690477 | AccessShareLock | logical replication apply worker
>   (1 row)
>
> But the apply worker is not waiting for any locks, so what's going on?
>
> Well, the problem is the apply worker is waiting for notification from
> the tablesync workers the relation is synced, which happens through
> updating the pg_subscription_rel row. And that wait happens in
> wait_for_relation_state_change, which simply checks the row in a loop,
> with a sleep by WaitLatch().
>
> Unfortunately, the tablesync workers can't update the row because the
> client backend executing ALTER SUBSCRIPTION ... REFRESH PUBLICATION
> sneaked in, and waits for an AccessExclusiveLock. So the tablesync
> workers are stuck in the queue and can't proceed.
>
> The client backend can't proceed, because it's waiting for a lock held
> by the apply worker.
>

It seems there is some inconsistency in what you have written for
client backends/tablesync worker vs. apply worker. The above text
seems to be saying that the client backend and table sync worker are
waiting on a "subscription row in pg_subscription" and the apply
worker is operating on "pg_subscription_rel". So, if that is true then
they shouldn't get stuck.

I think here the client backend and tablesync worker seem to be blocked
on a lock on pg_subscription_rel.

> The tablesync workers can't proceed because their lock request is stuck
> behind the AccessExclusiveLock request.
>
> And the apply worker can't proceed, because it's waiting for status
> update from the tablesync workers.
>

This part is not clear to me because
wait_for_relation_state_change()->GetSubscriptionRelState() seems to
be releasing the lock while closing the relation. Am I missing
something?

-- 
With Regards,
Amit Kapila.




undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION

2023-11-21 Thread Tomas Vondra
Hi,

I decided to do some stress-testing of the built-in logical replication,
as part of the sequence decoding work. And I soon ran into an undetected
deadlock related to ALTER SUBSCRIPTION ... REFRESH PUBLICATION :-(

The attached bash script triggers that in a couple of seconds for me. The
script looks complicated, but most of the code is waiting for sync to
complete, catch up, and that sort of thing.

What the script does is pretty simple:

1) initialize two clusters, set them as publisher/subscriber pair

2) create some number of tables, add them to publication and wait for
   the sync to complete

3) start two pgbench runs in the background, modifying the publication
   (one removes+adds all tables in a single transaction, one does that
with transaction per table)

4) run refresh.sh which does ALTER PUBLICATION ... REFRESH PUBLICATION
   in a loop (now that I think about it, could be another pgbench
   script, but well ...)

5) some consistency checks, but the lockup happens earlier so this does
   not really matter

After a small number of refresh cycles (for me it's usually a couple
dozen), we end up with a couple stuck locks (I shortened the backend
type string a bit, for formatting reasons):

  test=# select a.pid, classid, objid, backend_type, query
   from pg_locks l join pg_stat_activity a on (a.pid = l.pid)
  where not granted;

 pid   | classid | objid | backend_type | query
  -+-+---+--+--
   2691941 |6100 | 16785 | client backend   | ALTER SUBSCRIPTION s
  REFRESH PUBLICATION
   2691837 |6100 | 16785 | tablesync worker |
   2691936 |6100 | 16785 | tablesync worker |
  (3 rows)

All these backends wait for 6100/16785, which is the subscription row in
pg_subscription. The tablesync workers are requesting AccessShareLock,
the client backend however asks for AccessExclusiveLock.

The entry is currently locked by:

  test=# select a.pid, mode, backend_type from pg_locks l
   join pg_stat_activity a on (a.pid = l.pid)
  where classid=6100 and objid=16785 and granted;

 pid   |  mode   |   backend_type
  -+-+--
   2690477 | AccessShareLock | logical replication apply worker
  (1 row)

But the apply worker is not waiting for any locks, so what's going on?

Well, the problem is that the apply worker is waiting for notification
from the tablesync workers that the relation is synced, which happens
through updating the pg_subscription_rel row. And that wait happens in
wait_for_relation_state_change, which simply checks the row in a loop,
with a sleep by WaitLatch().
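
(Simplified sketch of that loop, for illustration -- the real function in
src/backend/replication/logical/tablesync.c has a few more checks:)

    for (;;)
    {
        char        state;
        XLogRecPtr  statelsn;

        CHECK_FOR_INTERRUPTS();

        state = GetSubscriptionRelState(MyLogicalRepWorker->subid,
                                        relid, &statelsn);
        if (state == expected_state)
            return true;

        /*
         * Not a lock wait, just a latch sleep -- which is why the deadlock
         * detector never sees this edge of the cycle.
         */
        (void) WaitLatch(MyLatch,
                         WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
                         1000L, WAIT_EVENT_LOGICAL_SYNC_STATE_CHANGE);
        ResetLatch(MyLatch);
    }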

Unfortunately, the tablesync workers can't update the row because the
client backend executing ALTER SUBSCRIPTION ... REFRESH PUBLICATION
sneaked in, and waits for an AccessExclusiveLock. So the tablesync
workers are stuck in the queue and can't proceed.

The client backend can't proceed, because it's waiting for a lock held
by the apply worker.

The tablesync workers can't proceed because their lock request is stuck
behind the AccessExclusiveLock request.

And the apply worker can't proceed, because it's waiting for status
update from the tablesync workers.

And the deadlock is undetected, because the apply worker is not waiting
on a lock, but sleeping on a latch :-(


I don't know what's the right solution here. I wonder if the apply
worker might release the lock before waiting for the update, that'd
solve this whole issue.

Alternatively, ALTER PUBLICATION might wait for the lock only for a
limited amount of time, and try again (but then it'd be susceptible to
starving, of course).

Or maybe there's a way to make this work in a way that would be visible
to the deadlock detector? That'd mean we occasionally get processes
killed to resolve a deadlock, but that's still better than processes
stuck indefinitely ...


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

lockup-test.sh
Description: application/shellscript


refresh.sh
Description: application/shellscript