Re: Reset snapshot export state on the transaction abort

2021-10-18 Thread Dilip Kumar
On Mon, Oct 18, 2021 at 8:51 AM Michael Paquier  wrote:
>
> On Sun, Oct 17, 2021 at 09:33:48AM +0900, Michael Paquier wrote:
> > That seems logically fine.  I'll check that tomorrow.
>
> And that looks indeed fine.  I have adjusted a couple of things, and
> backpatched the fix.

Thanks!


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: Reset snapshot export state on the transaction abort

2021-10-17 Thread Michael Paquier
On Sun, Oct 17, 2021 at 09:33:48AM +0900, Michael Paquier wrote:
> That seems logically fine.  I'll check that tomorrow.

And that looks indeed fine.  I have adjusted a couple of things, and
backpatched the fix.
--
Michael


signature.asc
Description: PGP signature


Re: Reset snapshot export state on the transaction abort

2021-10-16 Thread Michael Paquier
On Sat, Oct 16, 2021 at 08:31:36AM -0700, Zhihong Yu wrote:
> On Sat, Oct 16, 2021 at 3:10 AM Dilip Kumar  wrote:
>> On Sat, Oct 16, 2021 at 9:13 AM Michael Paquier 
>> wrote:
>>> One solution would be as simple as saving
>>> SavedResourceOwnerDuringExport into a temporary variable before
>>> calling AbortCurrentTransaction(), and save it back into
>>> CurrentResourceOwner once we are done in
>>> SnapBuildClearExportedSnapshot() as we need to rely on
>>> AbortTransaction() to do the static state cleanup if an error happens
>>> until the command after the replslot creation command shows up.
>>
>> Yeah, this idea looks fine to me.  I have modified the patch.  In
>> addition to that I have removed calling
>> ResetSnapBuildExportSnapshotState from the
>> SnapBuildClearExportedSnapshot because that is anyway being called
>> from the AbortTransaction.

That seems logically fine.  I'll check that tomorrow.

> +extern void ResetSnapBuildExportSnapshotState(void);
> 
> ResetSnapBuildExportSnapshotState() is only called inside snapbuild.c
> I wonder if the addition to snapbuild.h is needed.

As of xact.c in v2 of the patch, we have that:
@@ -2698,6 +2699,9 @@ AbortTransaction(void)
/* Reset logical streaming state. */
ResetLogicalStreamingState();

+   /* Reset snapshot export state. */
+   ResetSnapBuildExportSnapshotState();
--
Michael


signature.asc
Description: PGP signature


Re: Reset snapshot export state on the transaction abort

2021-10-16 Thread Zhihong Yu
On Sat, Oct 16, 2021 at 3:10 AM Dilip Kumar  wrote:

> On Sat, Oct 16, 2021 at 9:13 AM Michael Paquier 
> wrote:
> >
> > While double-checking this stuff, I have noticed something that's
> > wrong in the patch when a command that follows a
> > CREATE_REPLICATION_SLOT query resets SnapBuildClearExportedSnapshot().
> > Once the slot is created, the WAL sender is in a TRANS_INPROGRESS
> > state, meaning that AbortCurrentTransaction() would call
> > AbortTransaction(), hence calling ResetSnapBuildExportSnapshotState()
> > and resetting SavedResourceOwnerDuringExport to NULL before we store a
> > NULL into CurrentResourceOwner :)
>
> Right, good catch!
>
> > One solution would be as simple as saving
> > SavedResourceOwnerDuringExport into a temporary variable before
> > calling AbortCurrentTransaction(), and save it back into
> > CurrentResourceOwner once we are done in
> > SnapBuildClearExportedSnapshot() as we need to rely on
> > AbortTransaction() to do the static state cleanup if an error happens
> > until the command after the replslot creation command shows up.
>
> Yeah, this idea looks fine to me.  I have modified the patch.  In
> addition to that I have removed calling
> ResetSnapBuildExportSnapshotState from the
> SnapBuildClearExportedSnapshot because that is anyway being called
> from the AbortTransaction.
>
>
>
> --
> Regards,
> Dilip Kumar
> EnterpriseDB: http://www.enterprisedb.com

Hi,

bq. While exporting a snapshot we set a temporary states which get

 a temporary states -> temporary states

+extern void ResetSnapBuildExportSnapshotState(void);

ResetSnapBuildExportSnapshotState() is only called inside snapbuild.c
I wonder if the addition to snapbuild.h is needed.

Cheers


Re: Reset snapshot export state on the transaction abort

2021-10-16 Thread Dilip Kumar
On Sat, Oct 16, 2021 at 9:13 AM Michael Paquier  wrote:
>
> While double-checking this stuff, I have noticed something that's
> wrong in the patch when a command that follows a
> CREATE_REPLICATION_SLOT query resets SnapBuildClearExportedSnapshot().
> Once the slot is created, the WAL sender is in a TRANS_INPROGRESS
> state, meaning that AbortCurrentTransaction() would call
> AbortTransaction(), hence calling ResetSnapBuildExportSnapshotState()
> and resetting SavedResourceOwnerDuringExport to NULL before we store a
> NULL into CurrentResourceOwner :)

Right, good catch!

> One solution would be as simple as saving
> SavedResourceOwnerDuringExport into a temporary variable before
> calling AbortCurrentTransaction(), and save it back into
> CurrentResourceOwner once we are done in
> SnapBuildClearExportedSnapshot() as we need to rely on
> AbortTransaction() to do the static state cleanup if an error happens
> until the command after the replslot creation command shows up.

Yeah, this idea looks fine to me.  I have modified the patch.  In
addition to that I have removed calling
ResetSnapBuildExportSnapshotState from the
SnapBuildClearExportedSnapshot because that is anyway being called
from the AbortTransaction.



-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
From 062a84dcd821fa5e41998c8714f8ec16befcf983 Mon Sep 17 00:00:00 2001
From: Dilip Kumar 
Date: Mon, 11 Oct 2021 20:38:58 +0530
Subject: [PATCH v2] Reset snapshot export state during abort

While exporting a snapshot we set a temporary states which get
cleaned up only while executing the next replication command.
But if the snapshot exporting transaction gets aborted then those
states are not cleaned.  This patch fix this by cleaning it up
during AbortTransaction.
---
 src/backend/access/transam/xact.c   |  4 
 src/backend/replication/logical/snapbuild.c | 18 +-
 src/include/replication/snapbuild.h |  1 +
 3 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 4cc38f0..d3f7440 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -46,6 +46,7 @@
 #include "replication/logical.h"
 #include "replication/logicallauncher.h"
 #include "replication/origin.h"
+#include "replication/snapbuild.h"
 #include "replication/syncrep.h"
 #include "replication/walsender.h"
 #include "storage/condition_variable.h"
@@ -2698,6 +2699,9 @@ AbortTransaction(void)
 	/* Reset logical streaming state. */
 	ResetLogicalStreamingState();
 
+	/* Reset snapshot export state. */
+	ResetSnapBuildExportSnapshotState();
+
 	/* If in parallel mode, clean up workers and exit parallel mode. */
 	if (IsInParallelMode())
 	{
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index a54..d889c22 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -682,6 +682,8 @@ SnapBuildGetOrBuildSnapshot(SnapBuild *builder, TransactionId xid)
 void
 SnapBuildClearExportedSnapshot(void)
 {
+	ResourceOwner	tmpResOwner;
+
 	/* nothing exported, that is the usual case */
 	if (!ExportInProgress)
 		return;
@@ -689,10 +691,24 @@ SnapBuildClearExportedSnapshot(void)
 	if (!IsTransactionState())
 		elog(ERROR, "clearing exported snapshot in wrong transaction state");
 
+	/*
+	 * Abort transaction will reset the SavedResourceOwnerDuringExport so
+	 * remember this in a local variable.
+	 */
+	tmpResOwner = SavedResourceOwnerDuringExport;
+
 	/* make sure nothing  could have ever happened */
 	AbortCurrentTransaction();
 
-	CurrentResourceOwner = SavedResourceOwnerDuringExport;
+	CurrentResourceOwner = tmpResOwner;
+}
+
+/*
+ * Reset snapshot export state on the transaction abort.
+ */
+void
+ResetSnapBuildExportSnapshotState(void)
+{
 	SavedResourceOwnerDuringExport = NULL;
 	ExportInProgress = false;
 }
diff --git a/src/include/replication/snapbuild.h b/src/include/replication/snapbuild.h
index de72124..6a1082b 100644
--- a/src/include/replication/snapbuild.h
+++ b/src/include/replication/snapbuild.h
@@ -70,6 +70,7 @@ extern void SnapBuildSnapDecRefcount(Snapshot snap);
 extern Snapshot SnapBuildInitialSnapshot(SnapBuild *builder);
 extern const char *SnapBuildExportSnapshot(SnapBuild *snapstate);
 extern void SnapBuildClearExportedSnapshot(void);
+extern void ResetSnapBuildExportSnapshotState(void);
 
 extern SnapBuildState SnapBuildCurrentState(SnapBuild *snapstate);
 extern Snapshot SnapBuildGetOrBuildSnapshot(SnapBuild *builder,
-- 
1.8.3.1



Re: Reset snapshot export state on the transaction abort

2021-10-15 Thread Michael Paquier
On Thu, Oct 14, 2021 at 02:58:55PM +0530, Dilip Kumar wrote:
> On Thu, Oct 14, 2021 at 12:24 PM Michael Paquier  wrote:
>> Yes, you are right here.  I did not remember the semantics this relies
>> on.  I have played more with the patch, reviewed the whole, and the
>> fields you are resetting as part of the snapshot builds seem correct
>> to me.  So let's fix this.
> 
> Great, thanks!

While double-checking this stuff, I have noticed something that's
wrong in the patch when a command that follows a
CREATE_REPLICATION_SLOT query resets SnapBuildClearExportedSnapshot().
Once the slot is created, the WAL sender is in a TRANS_INPROGRESS
state, meaning that AbortCurrentTransaction() would call
AbortTransaction(), hence calling ResetSnapBuildExportSnapshotState()
and resetting SavedResourceOwnerDuringExport to NULL before we store a
NULL into CurrentResourceOwner :)

One solution would be as simple as saving
SavedResourceOwnerDuringExport into a temporary variable before
calling AbortCurrentTransaction(), and save it back into
CurrentResourceOwner once we are done in
SnapBuildClearExportedSnapshot() as we need to rely on
AbortTransaction() to do the static state cleanup if an error happens
until the command after the replslot creation command shows up.
--
Michael


signature.asc
Description: PGP signature


Re: Reset snapshot export state on the transaction abort

2021-10-14 Thread Dilip Kumar
On Thu, Oct 14, 2021 at 12:24 PM Michael Paquier  wrote:
>
> On Wed, Oct 13, 2021 at 10:53:24AM +0530, Dilip Kumar wrote:
> > Actually, it is not required because 1) Snapshot export can not be
> > allowed within a transaction block, basically, it starts its own
> > transaction block and aborts that while executing any next replication
> > command see SnapBuildClearExportedSnapshot().  So our problem is only
> > if the transaction block internally started for exporting, gets
> > aborted before any next command arrives.  So there is no possibility
> > of starting any sub transaction.
>
> Yes, you are right here.  I did not remember the semantics this relies
> on.  I have played more with the patch, reviewed the whole, and the
> fields you are resetting as part of the snapshot builds seem correct
> to me.  So let's fix this.

Great, thanks!

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: Reset snapshot export state on the transaction abort

2021-10-14 Thread Michael Paquier
On Wed, Oct 13, 2021 at 10:53:24AM +0530, Dilip Kumar wrote:
> Actually, it is not required because 1) Snapshot export can not be
> allowed within a transaction block, basically, it starts its own
> transaction block and aborts that while executing any next replication
> command see SnapBuildClearExportedSnapshot().  So our problem is only
> if the transaction block internally started for exporting, gets
> aborted before any next command arrives.  So there is no possibility
> of starting any sub transaction.

Yes, you are right here.  I did not remember the semantics this relies
on.  I have played more with the patch, reviewed the whole, and the
fields you are resetting as part of the snapshot builds seem correct
to me.  So let's fix this.
--
Michael


signature.asc
Description: PGP signature


Re: Reset snapshot export state on the transaction abort

2021-10-12 Thread Dilip Kumar
On Wed, Oct 13, 2021 at 10:17 AM Michael Paquier  wrote:
>
> On Mon, Oct 11, 2021 at 08:46:32PM +0530, Dilip Kumar wrote:
> > As reported at [1], if the transaction is aborted during export
> > snapshot then ExportInProgress and SavedResourceOwnerDuringExport are
> > not getting reset and that is throwing an error
> > "clearing exported snapshot in wrong transaction state" while
> > executing the next command.  The attached patch clears this state if
> > the transaction is aborted.

Correct.

> Injecting an error is enough to reproduce the failure in a second
> command after the first one failed.  This could happen on OOM for the
> palloc() done at the beginning of SnapBuildInitialSnapshot().
>
> @@ -2698,6 +2698,9 @@ AbortTransaction(void)
> /* Reset logical streaming state. */
> ResetLogicalStreamingState();
>
> +   /* Reset snapshot export state. */
> +   ResetSnapBuildExportSnapshotState();
> Shouldn't we care about the case of a sub-transaction abort as well?
> See AbortSubTransaction().


Actually, it is not required because 1) Snapshot export can not be
allowed within a transaction block, basically, it starts its own
transaction block and aborts that while executing any next replication
command see SnapBuildClearExportedSnapshot().  So our problem is only
if the transaction block internally started for exporting, gets
aborted before any next command arrives.  So there is no possibility
of starting any sub transaction.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: Reset snapshot export state on the transaction abort

2021-10-12 Thread Michael Paquier
On Mon, Oct 11, 2021 at 08:46:32PM +0530, Dilip Kumar wrote:
> As reported at [1], if the transaction is aborted during export
> snapshot then ExportInProgress and SavedResourceOwnerDuringExport are
> not getting reset and that is throwing an error
> "clearing exported snapshot in wrong transaction state" while
> executing the next command.  The attached patch clears this state if
> the transaction is aborted.

Injecting an error is enough to reproduce the failure in a second
command after the first one failed.  This could happen on OOM for the
palloc() done at the beginning of SnapBuildInitialSnapshot().

@@ -2698,6 +2698,9 @@ AbortTransaction(void)
/* Reset logical streaming state. */
ResetLogicalStreamingState();

+   /* Reset snapshot export state. */
+   ResetSnapBuildExportSnapshotState();
Shouldn't we care about the case of a sub-transaction abort as well?
See AbortSubTransaction().
--
Michael


signature.asc
Description: PGP signature


Reset snapshot export state on the transaction abort

2021-10-11 Thread Dilip Kumar
As reported at [1], if the transaction is aborted during export
snapshot then ExportInProgress and SavedResourceOwnerDuringExport are
not getting reset and that is throwing an error
"clearing exported snapshot in wrong transaction state" while
executing the next command.  The attached patch clears this state if
the transaction is aborted.

[1] 
https://www.postgresql.org/message-id/CAFiTN-tqopqpfS6HHug2nnOGieJJ_nm-Nvy0WBZ=zpo-lqt...@mail.gmail.com

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
From 673bea5121f0a6acadb1304e4a2b1027f5914e72 Mon Sep 17 00:00:00 2001
From: Dilip Kumar 
Date: Mon, 11 Oct 2021 20:38:58 +0530
Subject: [PATCH v1] Reset snapshot export state during abort

---
 src/backend/access/transam/xact.c   | 3 +++
 src/backend/replication/logical/snapbuild.c | 9 +
 src/include/replication/snapbuild.h | 1 +
 3 files changed, 13 insertions(+)

diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 4cc38f0..60be6bb 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -2698,6 +2698,9 @@ AbortTransaction(void)
 	/* Reset logical streaming state. */
 	ResetLogicalStreamingState();
 
+	/* Reset snapshot export state. */
+	ResetSnapBuildExportSnapshotState();
+
 	/* If in parallel mode, clean up workers and exit parallel mode. */
 	if (IsInParallelMode())
 	{
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index a54..5127ea5 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -693,6 +693,15 @@ SnapBuildClearExportedSnapshot(void)
 	AbortCurrentTransaction();
 
 	CurrentResourceOwner = SavedResourceOwnerDuringExport;
+	ResetSnapBuildExportSnapshotState();
+}
+
+/*
+ * Reset snapshot export state on the transaction abort.
+ */
+void
+ResetSnapBuildExportSnapshotState(void)
+{
 	SavedResourceOwnerDuringExport = NULL;
 	ExportInProgress = false;
 }
diff --git a/src/include/replication/snapbuild.h b/src/include/replication/snapbuild.h
index de72124..6a1082b 100644
--- a/src/include/replication/snapbuild.h
+++ b/src/include/replication/snapbuild.h
@@ -70,6 +70,7 @@ extern void SnapBuildSnapDecRefcount(Snapshot snap);
 extern Snapshot SnapBuildInitialSnapshot(SnapBuild *builder);
 extern const char *SnapBuildExportSnapshot(SnapBuild *snapstate);
 extern void SnapBuildClearExportedSnapshot(void);
+extern void ResetSnapBuildExportSnapshotState(void);
 
 extern SnapBuildState SnapBuildCurrentState(SnapBuild *snapstate);
 extern Snapshot SnapBuildGetOrBuildSnapshot(SnapBuild *builder,
-- 
1.8.3.1