Re: Single transaction in the tablesync worker?

2021-02-25 Thread Masahiko Sawada
On Thu, Feb 25, 2021 at 1:52 PM Amit Kapila  wrote:
>
> On Wed, Feb 24, 2021 at 5:55 PM Amit Kapila  wrote:
> >
> > On Wed, Feb 24, 2021 at 12:47 PM Masahiko Sawada  
> > wrote:
> > >
> > > On Fri, Feb 12, 2021 at 2:49 PM Amit Kapila  
> > > wrote:
> > > >
> > > >
> > > > Thanks, I have pushed the fix and the latest run of 'thorntail' has 
> > > > passed.
> > >
> > > I got the following WARNING message from a logical replication apply 
> > > worker:
> > >
> > > WARNING:  relcache reference leak: relation "pg_subscription_rel" not 
> > > closed
> > >
> > > The cause of this is that GetSubscriptionRelState() doesn't close the
> > > relation in SUBREL_STATE_UNKNOWN case. It seems that commit ce0fdbfe9
> > > forgot to close it. I've attached the patch to fix this issue.
> > >
> >
> > Thanks for the report and fix. Your patch LGTM. I'll push it tomorrow.
> >
>
> Pushed!

Thank you!

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Single transaction in the tablesync worker?

2021-02-24 Thread Amit Kapila
On Wed, Feb 24, 2021 at 5:55 PM Amit Kapila  wrote:
>
> On Wed, Feb 24, 2021 at 12:47 PM Masahiko Sawada  
> wrote:
> >
> > On Fri, Feb 12, 2021 at 2:49 PM Amit Kapila  wrote:
> > >
> > >
> > > Thanks, I have pushed the fix and the latest run of 'thorntail' has 
> > > passed.
> >
> > I got the following WARNING message from a logical replication apply worker:
> >
> > WARNING:  relcache reference leak: relation "pg_subscription_rel" not closed
> >
> > The cause of this is that GetSubscriptionRelState() doesn't close the
> > relation in SUBREL_STATE_UNKNOWN case. It seems that commit ce0fdbfe9
> > forgot to close it. I've attached the patch to fix this issue.
> >
>
> Thanks for the report and fix. Your patch LGTM. I'll push it tomorrow.
>

Pushed!

-- 
With Regards,
Amit Kapila.




Re: Single transaction in the tablesync worker?

2021-02-24 Thread Amit Kapila
On Wed, Feb 24, 2021 at 12:47 PM Masahiko Sawada  wrote:
>
> On Fri, Feb 12, 2021 at 2:49 PM Amit Kapila  wrote:
> >
> >
> > Thanks, I have pushed the fix and the latest run of 'thorntail' has passed.
>
> I got the following WARNING message from a logical replication apply worker:
>
> WARNING:  relcache reference leak: relation "pg_subscription_rel" not closed
>
> The cause of this is that GetSubscriptionRelState() doesn't close the
> relation in SUBREL_STATE_UNKNOWN case. It seems that commit ce0fdbfe9
> forgot to close it. I've attached the patch to fix this issue.
>

Thanks for the report and fix. Your patch LGTM. I'll push it tomorrow.

-- 
With Regards,
Amit Kapila.




Re: Single transaction in the tablesync worker?

2021-02-23 Thread Masahiko Sawada
On Fri, Feb 12, 2021 at 2:49 PM Amit Kapila  wrote:
>
> On Fri, Feb 12, 2021 at 10:08 AM Ajin Cherian  wrote:
> >
> > On Fri, Feb 12, 2021 at 2:46 PM Amit Kapila  wrote:
> >
> > >
> > > Thanks, I have pushed the patch but getting one failure:
> > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=thorntail=2021-02-12%2002%3A28%3A12
> > >
> > > The reason seems to be that we are trying to connect and
> > > max_wal_senders is set to zero. I think we can write this without
> > > trying to connect. The attached patch fixes the problem for me. What
> > > do you think?
> >
> > Verified this with installcheck and modified configuration to have
> > wal_level = minimal and max_wal_senders = 0.
> > Tests passed. The changes look good  to me.
> >
>
> Thanks, I have pushed the fix and the latest run of 'thorntail' has passed.

I got the following WARNING message from a logical replication apply worker:

WARNING:  relcache reference leak: relation "pg_subscription_rel" not closed

The cause of this is that GetSubscriptionRelState() doesn't close the
relation in the SUBREL_STATE_UNKNOWN case. It seems that commit ce0fdbfe9
forgot to close it. I've attached a patch to fix this issue.
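A sketch of the shape of the fix (variable names and lock level are assumed
here; the attached patch is the authoritative version):

	/* In GetSubscriptionRelState(), the not-found path must also close
	 * pg_subscription_rel, as the normal path already does. */
	if (!HeapTupleIsValid(tup))
	{
		table_close(rel, AccessShareLock);	/* this close was missing */
		return SUBREL_STATE_UNKNOWN;
	}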

Here are the steps to reproduce:

1. On both publisher and subscriber:
create table test (a int primary key);

2. On publisher:
create publication test_pub for table test;

3. On subscriber:
create subscription test_sub connection 'dbname=postgres' publication test_pub;
-- wait until table sync finished
drop table test;
create table test (a int primary key);

From this point, you will get the WARNING message when doing an
insert/update/delete/truncate on the 'test' table on the publisher.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/


fix_relcache_leak.patch
Description: Binary data


Re: Single transaction in the tablesync worker?

2021-02-11 Thread Amit Kapila
On Fri, Feb 12, 2021 at 10:08 AM Ajin Cherian  wrote:
>
> On Fri, Feb 12, 2021 at 2:46 PM Amit Kapila  wrote:
>
> >
> > Thanks, I have pushed the patch but getting one failure:
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=thorntail=2021-02-12%2002%3A28%3A12
> >
> > The reason seems to be that we are trying to connect and
> > max_wal_senders is set to zero. I think we can write this without
> > trying to connect. The attached patch fixes the problem for me. What
> > do you think?
>
> Verified this with installcheck and modified configuration to have
> wal_level = minimal and max_wal_senders = 0.
> Tests passed. The changes look good  to me.
>

Thanks, I have pushed the fix and the latest run of 'thorntail' has passed.

-- 
With Regards,
Amit Kapila.




Re: Single transaction in the tablesync worker?

2021-02-11 Thread Ajin Cherian
On Fri, Feb 12, 2021 at 2:46 PM Amit Kapila  wrote:

>
> Thanks, I have pushed the patch but getting one failure:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=thorntail=2021-02-12%2002%3A28%3A12
>
> The reason seems to be that we are trying to connect and
> max_wal_senders is set to zero. I think we can write this without
> trying to connect. The attached patch fixes the problem for me. What
> do you think?

Verified this with installcheck and a configuration modified to have
wal_level = minimal and max_wal_senders = 0.
Tests passed. The changes look good to me.
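(That is, roughly this in the installation's postgresql.conf for the
installcheck run:

	wal_level = minimal
	max_wal_senders = 0
)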

regards,
Ajin Cherian
Fujitsu Australia




Re: Single transaction in the tablesync worker?

2021-02-11 Thread Amit Kapila
On Fri, Feb 12, 2021 at 7:18 AM Ajin Cherian  wrote:
>
> On Thu, Feb 11, 2021 at 10:38 PM Amit Kapila  wrote:
>
> > Okay, attached an updated patch with only that change.
>
> I ran Erik's test suite [1] on this patch overnight and found no
> errors. No more comments from me. The patch looks good.
>

Thanks, I have pushed the patch but getting one failure:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=thorntail=2021-02-12%2002%3A28%3A12

The reason seems to be that we are trying to connect and
max_wal_senders is set to zero. I think we can write this without
trying to connect. The attached patch fixes the problem for me. What
do you think?
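(A sketch only, with placeholder names -- fix_subs_test_1.patch is the
authoritative change. The idea would be to create the test subscription
without connecting, e.g.:

	CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist'
		PUBLICATION testpub WITH (connect = false);

so the test no longer needs a walsender on the publisher.)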


-- 
With Regards,
Amit Kapila.


fix_subs_test_1.patch
Description: Binary data


Re: Single transaction in the tablesync worker?

2021-02-11 Thread Ajin Cherian
On Thu, Feb 11, 2021 at 10:38 PM Amit Kapila  wrote:

> Okay, attached an updated patch with only that change.

I ran Erik's test suite [1] on this patch overnight and found no
errors. No more comments from me. The patch looks good.

regards,
Ajin Cherian
Fujitsu Australia

[1]- 
https://www.postgresql.org/message-id/93d02794068482f96d31b002e0eb248d%40xs4all.nl




Re: Single transaction in the tablesync worker?

2021-02-11 Thread Amit Kapila
On Thu, Feb 11, 2021 at 3:32 PM Petr Jelinek
 wrote:
>
> On 11 Feb 2021, at 10:56, Amit Kapila  wrote:
> >
> >> Well, I was thinking it could be NOTICE or LOG to be honest, WARNING seems 
> >> unnecessarily scary for those usecases to me.
> >>
> >
> > I am fine with LOG and will make that change. Do you have any more
> > comments or want to spend more time on this patch before we call it
> > good?
>
> I am good, thanks!
>

Okay, attached an updated patch with only that change.

-- 
With Regards,
Amit Kapila.


v32-0001-Allow-multiple-xacts-during-table-sync-in-logica.patch
Description: Binary data


Re: Single transaction in the tablesync worker?

2021-02-11 Thread Petr Jelinek
On 11 Feb 2021, at 10:56, Amit Kapila  wrote:
> 
> On Thu, Feb 11, 2021 at 3:20 PM Petr Jelinek
>  wrote:
>> 
>> On 11 Feb 2021, at 10:42, Amit Kapila  wrote:
>>> 
>>> On Thu, Feb 11, 2021 at 1:51 PM Petr Jelinek
>>>  wrote:
 
 On 10 Feb 2021, at 06:32, Amit Kapila  wrote:
> 
> On Wed, Feb 10, 2021 at 7:41 AM Peter Smith  wrote:
>> 
>> On Tue, Feb 9, 2021 at 10:38 AM Peter Smith  
>> wrote:
>>> 
>> 
>> PSA v2 of this WalRcvExceResult patch (it is same as v1 but includes
>> some PG doc updates).
>> This applies OK on top of v30 of the main patch.
>> 
> 
> Thanks, I have integrated these changes into the main patch and
> additionally made some changes to comments and docs. I have also fixed
> the function name inconsistency issue you reported and ran pgindent.
 
 One thing:
 
> + else if (res->status == WALRCV_ERROR &&
> +  missing_ok &&
> +  res->sqlstate == ERRCODE_UNDEFINED_OBJECT)
> + {
> + /* WARNING. Error, but missing_ok = true. */
> + ereport(WARNING,
> (errmsg("could not drop the 
> replication slot \"%s\" on publisher",
> slotname),
>  errdetail("The error was: %s", 
> res->err)));
 
 Hmm, why is this WARNING, we mostly call it with missing_ok = true when 
 the slot is not expected to be there, so it does not seem correct to 
 report it as warning?
 
>>> 
>>> WARNING is for the cases where we don't always expect slots to exist
>>> and we don't want to stop the operation due to it. For example, in
>>> DropSubscription, for some of the rel states like (SUBREL_STATE_INIT
>>> and SUBREL_STATE_DATASYNC), the slot won't exist. Similarly, say if we
>>> fail (due to network error) after removing some of the slots, next
>>> time, it will again try to drop already dropped slots and fail. For
>>> these reasons, we need to use WARNING. Similarly for tablesync workers
>>> when we are trying to initially drop the slot there is no certainty
>>> that it exists, so we can't throw ERROR and stop the operation there.
>>> There are other cases like when the table sync worker has finished
>>> syncing the table, there we will raise an ERROR if the slot doesn't
>>> exist. Does this make sense?
>> 
>> Well, I was thinking it could be NOTICE or LOG to be honest, WARNING seems 
>> unnecessarily scary for those usecases to me.
>> 
> 
> I am fine with LOG and will make that change. Do you have any more
> comments or want to spend more time on this patch before we call it
> good?

I am good, thanks!

—
Petr



Re: Single transaction in the tablesync worker?

2021-02-11 Thread Amit Kapila
On Thu, Feb 11, 2021 at 3:20 PM Petr Jelinek
 wrote:
>
> On 11 Feb 2021, at 10:42, Amit Kapila  wrote:
> >
> > On Thu, Feb 11, 2021 at 1:51 PM Petr Jelinek
> >  wrote:
> >>
> >> On 10 Feb 2021, at 06:32, Amit Kapila  wrote:
> >>>
> >>> On Wed, Feb 10, 2021 at 7:41 AM Peter Smith  wrote:
> 
>  On Tue, Feb 9, 2021 at 10:38 AM Peter Smith  
>  wrote:
> >
> 
>  PSA v2 of this WalRcvExceResult patch (it is same as v1 but includes
>  some PG doc updates).
>  This applies OK on top of v30 of the main patch.
> 
> >>>
> >>> Thanks, I have integrated these changes into the main patch and
> >>> additionally made some changes to comments and docs. I have also fixed
> >>> the function name inconsistency issue you reported and ran pgindent.
> >>
> >> One thing:
> >>
> >>> + else if (res->status == WALRCV_ERROR &&
> >>> +  missing_ok &&
> >>> +  res->sqlstate == ERRCODE_UNDEFINED_OBJECT)
> >>> + {
> >>> + /* WARNING. Error, but missing_ok = true. */
> >>> + ereport(WARNING,
> >>>  (errmsg("could not drop the 
> >>> replication slot \"%s\" on publisher",
> >>>  slotname),
> >>>   errdetail("The error was: %s", 
> >>> res->err)));
> >>
> >> Hmm, why is this WARNING, we mostly call it with missing_ok = true when 
> >> the slot is not expected to be there, so it does not seem correct to 
> >> report it as warning?
> >>
> >
> > WARNING is for the cases where we don't always expect slots to exist
> > and we don't want to stop the operation due to it. For example, in
> > DropSubscription, for some of the rel states like (SUBREL_STATE_INIT
> > and SUBREL_STATE_DATASYNC), the slot won't exist. Similarly, say if we
> > fail (due to network error) after removing some of the slots, next
> > time, it will again try to drop already dropped slots and fail. For
> > these reasons, we need to use WARNING. Similarly for tablesync workers
> > when we are trying to initially drop the slot there is no certainty
> > that it exists, so we can't throw ERROR and stop the operation there.
> > There are other cases like when the table sync worker has finished
> > syncing the table, there we will raise an ERROR if the slot doesn't
> > exist. Does this make sense?
>
> Well, I was thinking it could be NOTICE or LOG to be honest, WARNING seems 
> unnecessarily scary for those usecases to me.
>

I am fine with LOG and will make that change. Do you have any more
comments or want to spend more time on this patch before we call it
good?
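(For reference, a sketch of the intended change, reusing the hunk quoted
above -- not the committed hunk itself:

	else if (res->status == WALRCV_ERROR &&
			 missing_ok &&
			 res->sqlstate == ERRCODE_UNDEFINED_OBJECT)
	{
		/* LOG instead of WARNING: error, but missing_ok = true. */
		ereport(LOG,
				(errmsg("could not drop the replication slot \"%s\" on publisher",
						slotname),
				 errdetail("The error was: %s", res->err)));
	}
)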

-- 
With Regards,
Amit Kapila.




Re: Single transaction in the tablesync worker?

2021-02-11 Thread Petr Jelinek
On 11 Feb 2021, at 10:42, Amit Kapila  wrote:
> 
> On Thu, Feb 11, 2021 at 1:51 PM Petr Jelinek
>  wrote:
>> 
>> On 10 Feb 2021, at 06:32, Amit Kapila  wrote:
>>> 
>>> On Wed, Feb 10, 2021 at 7:41 AM Peter Smith  wrote:
 
 On Tue, Feb 9, 2021 at 10:38 AM Peter Smith  wrote:
> 
 
 PSA v2 of this WalRcvExceResult patch (it is same as v1 but includes
 some PG doc updates).
 This applies OK on top of v30 of the main patch.
 
>>> 
>>> Thanks, I have integrated these changes into the main patch and
>>> additionally made some changes to comments and docs. I have also fixed
>>> the function name inconsistency issue you reported and ran pgindent.
>> 
>> One thing:
>> 
>>> + else if (res->status == WALRCV_ERROR &&
>>> +  missing_ok &&
>>> +  res->sqlstate == ERRCODE_UNDEFINED_OBJECT)
>>> + {
>>> + /* WARNING. Error, but missing_ok = true. */
>>> + ereport(WARNING,
>>>  (errmsg("could not drop the 
>>> replication slot \"%s\" on publisher",
>>>  slotname),
>>>   errdetail("The error was: %s", 
>>> res->err)));
>> 
>> Hmm, why is this WARNING, we mostly call it with missing_ok = true when the 
>> slot is not expected to be there, so it does not seem correct to report it 
>> as warning?
>> 
> 
> WARNING is for the cases where we don't always expect slots to exist
> and we don't want to stop the operation due to it. For example, in
> DropSubscription, for some of the rel states like (SUBREL_STATE_INIT
> and SUBREL_STATE_DATASYNC), the slot won't exist. Similarly, say if we
> fail (due to network error) after removing some of the slots, next
> time, it will again try to drop already dropped slots and fail. For
> these reasons, we need to use WARNING. Similarly for tablesync workers
> when we are trying to initially drop the slot there is no certainty
> that it exists, so we can't throw ERROR and stop the operation there.
> There are other cases like when the table sync worker has finished
> syncing the table, there we will raise an ERROR if the slot doesn't
> exist. Does this make sense?

Well, I was thinking it could be NOTICE or LOG, to be honest; WARNING seems
unnecessarily scary for those use cases to me.

—
Petr





Re: Single transaction in the tablesync worker?

2021-02-11 Thread Amit Kapila
On Thu, Feb 11, 2021 at 1:51 PM Petr Jelinek
 wrote:
>
> On 10 Feb 2021, at 06:32, Amit Kapila  wrote:
> >
> > On Wed, Feb 10, 2021 at 7:41 AM Peter Smith  wrote:
> >>
> >> On Tue, Feb 9, 2021 at 10:38 AM Peter Smith  wrote:
> >>>
> >>
> >> PSA v2 of this WalRcvExceResult patch (it is same as v1 but includes
> >> some PG doc updates).
> >> This applies OK on top of v30 of the main patch.
> >>
> >
> > Thanks, I have integrated these changes into the main patch and
> > additionally made some changes to comments and docs. I have also fixed
> > the function name inconsistency issue you reported and ran pgindent.
>
> One thing:
>
> > + else if (res->status == WALRCV_ERROR &&
> > +  missing_ok &&
> > +  res->sqlstate == ERRCODE_UNDEFINED_OBJECT)
> > + {
> > + /* WARNING. Error, but missing_ok = true. */
> > + ereport(WARNING,
> >   (errmsg("could not drop the 
> > replication slot \"%s\" on publisher",
> >   slotname),
> >errdetail("The error was: %s", 
> > res->err)));
>
> Hmm, why is this WARNING, we mostly call it with missing_ok = true when the 
> slot is not expected to be there, so it does not seem correct to report it as 
> warning?
>

WARNING is for the cases where we don't always expect slots to exist
and we don't want to stop the operation due to it. For example, in
DropSubscription, for some of the rel states like (SUBREL_STATE_INIT
and SUBREL_STATE_DATASYNC), the slot won't exist. Similarly, say if we
fail (due to network error) after removing some of the slots, next
time, it will again try to drop already dropped slots and fail. For
these reasons, we need to use WARNING. Similarly for tablesync workers
when we are trying to initially drop the slot there is no certainty
that it exists, so we can't throw ERROR and stop the operation there.
There are other cases like when the table sync worker has finished
syncing the table, there we will raise an ERROR if the slot doesn't
exist. Does this make sense?

-- 
With Regards,
Amit Kapila.




Re: Single transaction in the tablesync worker?

2021-02-11 Thread Petr Jelinek
On 10 Feb 2021, at 06:32, Amit Kapila  wrote:
> 
> On Wed, Feb 10, 2021 at 7:41 AM Peter Smith  wrote:
>> 
>> On Tue, Feb 9, 2021 at 10:38 AM Peter Smith  wrote:
>>> 
>> 
>> PSA v2 of this WalRcvExceResult patch (it is same as v1 but includes
>> some PG doc updates).
>> This applies OK on top of v30 of the main patch.
>> 
> 
> Thanks, I have integrated these changes into the main patch and
> additionally made some changes to comments and docs. I have also fixed
> the function name inconsistency issue you reported and ran pgindent.

One thing:

> + else if (res->status == WALRCV_ERROR &&
> +  missing_ok &&
> +  res->sqlstate == ERRCODE_UNDEFINED_OBJECT)
> + {
> + /* WARNING. Error, but missing_ok = true. */
> + ereport(WARNING,
>   (errmsg("could not drop the replication 
> slot \"%s\" on publisher",
>   slotname),
>errdetail("The error was: %s", 
> res->err)));

Hmm, why is this a WARNING? We mostly call it with missing_ok = true when the
slot is not expected to be there, so it does not seem correct to report it as a
warning.

--
Petr



Re: Single transaction in the tablesync worker?

2021-02-09 Thread Amit Kapila
On Wed, Feb 10, 2021 at 7:41 AM Peter Smith  wrote:
>
> On Tue, Feb 9, 2021 at 10:38 AM Peter Smith  wrote:
> >
>
> PSA v2 of this WalRcvExceResult patch (it is same as v1 but includes
> some PG doc updates).
> This applies OK on top of v30 of the main patch.
>

Thanks, I have integrated these changes into the main patch and
additionally made some changes to comments and docs. I have also fixed
the function name inconsistency issue you reported and ran pgindent.

-- 
With Regards,
Amit Kapila.


v31-0001-Allow-multiple-xacts-during-table-sync-in-logica.patch
Description: Binary data


Re: Single transaction in the tablesync worker?

2021-02-09 Thread Ajin Cherian
On Tue, Feb 9, 2021 at 10:38 AM Peter Smith  wrote:

> PSA an alternative patch. This one adds a new member to
> WalRcvExecResult and so is able to detect the "slot does not exist"
> error. This patch also applies on top of V28, if you want it.

Did some testing with this patch on top of v29. I could see that now,
while dropping the subscription, if the tablesync slot does not exist
on the publisher, then it gives a warning
but the command does not fail.

postgres=# CREATE SUBSCRIPTION tap_sub CONNECTION 'host=localhost
dbname=postgres port=6972' PUBLICATION tap_pub WITH (enabled = false);
NOTICE:  created replication slot "tap_sub" on publisher
CREATE SUBSCRIPTION
postgres=# ALTER SUBSCRIPTION tap_sub enable;
ALTER SUBSCRIPTION
postgres=# ALTER SUBSCRIPTION tap_sub disable;
ALTER SUBSCRIPTION
=== here, the tablesync slot exists on the publisher but I go and
=== manually drop it.

postgres=# drop subscription tap_sub;
WARNING:  could not drop the replication slot
"pg_16401_sync_16389_6927117142022745645" on publisher
DETAIL:  The error was: ERROR:  replication slot
"pg_16401_sync_16389_6927117142022745645" does not exist
NOTICE:  dropped replication slot "tap_sub" on publisher
DROP SUBSCRIPTION

I have a minor comment on the error message: the "The error was:" prefix
seems a bit redundant here. Maybe remove it, so that it looks like:

WARNING:  could not drop the replication slot
"pg_16401_sync_16389_6927117142022745645" on publisher
DETAIL:  ERROR:  replication slot
"pg_16401_sync_16389_6927117142022745645" does not exist

regards,
Ajin Cherian
Fujitsu Australia




Re: Single transaction in the tablesync worker?

2021-02-09 Thread Peter Smith
On Tue, Feb 9, 2021 at 10:38 AM Peter Smith  wrote:
>
> On Mon, Feb 8, 2021 at 11:42 AM Peter Smith  wrote:
> >
> > On Sun, Feb 7, 2021 at 2:38 PM Peter Smith  wrote:
> > >
> > > On Sat, Feb 6, 2021 at 2:10 AM Petr Jelinek
> > >  wrote:
> > > >
> > > > Hi,
> > > >
> > > > Some minor comments about code:
> > > >
> > > > > + else if (res->status == WALRCV_ERROR && missing_ok)
> > > > > + {
> > > > > + /* WARNING. Error, but missing_ok = true. */
> > > > > + ereport(WARNING,
> > > >
> > > > I wonder if we need to add error code to the WalRcvExecResult and check
> > > > for the appropriate ones here. Because this can for example return error
> > > > because of timeout, not because slot is missing. Not sure if it matters
> > > > for current callers though (but then maybe don't call the param
> > > > missing_ok?).
> > >
> > > You are right. The way we are using this function has evolved beyond
> > > the original intention.
> > > Probably renaming the param to something like "error_ok" would be more
> > > appropriate now.
> > >
> >
> > PSA a patch (apply on top of V28) to change the misleading param name.
> >
>
> PSA an alternative patch. This one adds a new member to
> WalRcvExecResult and so is able to detect the "slot does not exist"
> error. This patch also applies on top of V28, if you want it.
>

PSA v2 of this WalRcvExceResult patch (it is same as v1 but includes
some PG doc updates).
This applies OK on top of v30 of the main patch.

--
Kind Regards,
Peter Smith.
Fujitsu Australia


v2-0001-ReplicationSlotDropAtPubNode-detect-slot-does-not.patch
Description: Binary data


Re: Single transaction in the tablesync worker?

2021-02-09 Thread Peter Smith
On Tue, Feb 9, 2021 at 8:32 PM Amit Kapila  wrote:
>
> On Tue, Feb 9, 2021 at 12:02 PM Peter Smith  wrote:
> >
> > Here are my feedback comments for the V29 patch.
> >
>
> Thanks.
>
> >
> > 3.
> > Previously the tablesync origin name format was encapsulated in a
> > common function. IMO it was cleaner/safer how it was before, instead
> > of the same "pg_%u_%u" cut/paste and scattered in many places.
> > (same comment applies multiple places, in this file and in tablesync.c)

OK. I confirmed it is fixed in V30.

But I noticed that the new function name is not quite consistent with the
existing function for the slot name, e.g.
ReplicationSlotNameForTablesync versus
ReplicationOriginNameForTableSync (note "TableSync" instead of
"Tablesync").

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Single transaction in the tablesync worker?

2021-02-09 Thread Amit Kapila
On Tue, Feb 9, 2021 at 12:02 PM Peter Smith  wrote:
>
> Here are my feedback comments for the V29 patch.
>
> 
>
> FILE: logical-replication.sgml
>
> +slots have generated names:
> pg_%u_sync_%u_%llu
> +(parameters: Subscription oid,
> +Table relid, system
> identifiersysid)
> +   
>
> 1.
> There is a missing space before the sysid parameter.
>
> =
>
> FILE: subscriptioncmds.c
>
> + * SUBREL_STATE_FINISHEDCOPY. The apply worker can also
> + * concurrently try to drop the origin and by this time the
> + * origin might be already removed. For these reasons,
> + * passing missing_ok = true from here.
> + */
> + snprintf(originname, sizeof(originname), "pg_%u_%u", sub->oid, relid);
> + replorigin_drop_by_name(originname, true, false);
> + }
>
> 2.
> Don't really need to say "from here".
> (same comment applies multiple places, in this file and in tablesync.c)
>
> 3.
> Previously the tablesync origin name format was encapsulated in a
> common function. IMO it was cleaner/safer how it was before, instead
> of the same "pg_%u_%u" cut/paste and scattered in many places.
> (same comment applies multiple places, in this file and in tablesync.c)
>

Fixed all the three above comments.

> 4.
> Calls like replorigin_drop_by_name(originname, true, false); make it
> unnecessarily hard to read code when the boolean params are neither
> named as variables nor commented. I noticed on another thread [et0205]
> there was an idea that having no name/comments is fine because anyway
> it is not difficult to figure out when using a "modern IDE", but since
> my review tools are only "vi" and "meld" I beg to differ with that
> justification.
> (same comment applies multiple places, in this file and in tablesync.c)
>

Already responded to it separately. I went ahead and removed such
comments from other places in the patch.

> [et0205] 
> https://www.postgresql.org/message-id/c1d9833f-eeeb-40d5-89ba-87674e1b7ba3%40www.fastmail.com
>
> =
>
> FILE: tablesync.c
>
> 5.
> Previously there was a function tablesync_replorigin_drop which was
> encapsulating the tablesync origin name formatting. I thought that was
> better than the V29 code which now has the same formatting scattered
> over many places.
> (same comment applies for worker_internal.h)
>

I am not sure what you are expecting here that is different from point 3.

> + * Determine the tablesync slot name.
> + *
> + * The name must not exceed NAMEDATALEN - 1 because of remote node 
> constraints
> + * on slot name length. We do append system_identifier to avoid slot_name
> + * collision with subscriptions in other clusters. With current scheme
> + * pg_%u_sync_%u_UINT64_FORMAT (3 + 10 + 6 + 10 + 20 + '\0'), the maximum
> + * length of slot_name will be 50.
> + *
> + * The returned slot name is either:
> + * - stored in the supplied buffer (syncslotname), or
> + * - palloc'ed in current memory context (if syncslotname = NULL).
> + *
> + * Note: We don't use the subscription slot name as part of tablesync slot 
> name
> + * because we are responsible for cleaning up these slots and it could become
> + * impossible to recalculate what name to cleanup if the subscription slot 
> name
> + * had changed.
> + */
> +char *
> +ReplicationSlotNameForTablesync(Oid suboid, Oid relid, char
> syncslotname[NAMEDATALEN])
> +{
> + if (syncslotname)
> + sprintf(syncslotname, "pg_%u_sync_%u_" UINT64_FORMAT, suboid, relid,
> + GetSystemIdentifier());
> + else
> + syncslotname = psprintf("pg_%u_sync_%u_" UINT64_FORMAT, suboid, relid,
> + GetSystemIdentifier());
> +
> + return syncslotname;
> +}
>
> 6.
> "We do append" --> "We append"
> "With current scheme" -> "With the current scheme"
>

Fixed.

> 7.
> Maybe consider to just assign GetSystemIdentifier() to a static
> instead of calling that function for every slot?
> static uint64 sysid = GetSystemIdentifier();
> IIUC the sysid value is never going to change for a process, right?
>

Already responded.

> FILE: alter_subscription.sgml
>
> 8.
> +  
> +   Commands ALTER SUBSCRIPTION ... REFRESH .. and
> +   ALTER SUBSCRIPTION ... SET PUBLICATION .. with refresh
> +   option as true cannot be executed inside a transaction block.
> +  
>
> My guess is those two lots of double dots ("..") were probably meant
> to be ellipsis ("...")
>

Fixed. For the first one, I completed the command by adding PUBLICATION.

>
> When looking at the DropSubscription code I noticed that there is a
> small difference between the HEAD code and the V29 code when slot_name
> = NONE.
>
> HEAD does
> --
> if (!slotname)
> {
> table_close(rel, NoLock);
> return;
> }
> --
>
> V29 does
> --
> if (!slotname)
> {
> /* be tidy */
> list_free(rstates);
> return;
> }
> --
>
> Isn't the V29 code missing doing a table_close(rel, NoLock) there?
>

Yes, good catch. Fixed.

-- 
With Regards,
Amit Kapila.


v30-0001-Make-pg_replication_origin_drop-safe-against-con.patch

Re: Single transaction in the tablesync worker?

2021-02-09 Thread Amit Kapila
On Tue, Feb 9, 2021 at 1:37 PM Peter Smith  wrote:
>
> Looking at the V29 style tablesync slot names now they appear like this:
>
> WARNING:  could not drop tablesync replication slot
> "pg_16397_sync_16389_6927117142022745645"
> That is in the order subid +  relid + sysid
>
> Now that I see it in a message it seems a bit strange with the sysid
> just tacked onto the end like that.
>
> I am wondering if reordering of parent to child might be more natural.
> e.g sysid + subid + relid gives a more intuitive name IMO.
>
> So in this example it would be "pg_sync_6927117142022745645_16397_16389"
>

I have kept the order based on the importance of each parameter. Say
the user sees this message in the server log of the subscriber, either
for the purpose of tracking the origin's progress or for errors; the
sysid parameter won't be of much use, and they will mostly be looking
at subid and relid. OTOH, if for some reason this parameter appears in
the publisher logs, then sysid might be helpful.

Petr, anyone else, do you have any opinion on this matter?

-- 
With Regards,
Amit Kapila.




Re: Single transaction in the tablesync worker?

2021-02-09 Thread Amit Kapila
On Tue, Feb 9, 2021 at 12:02 PM Peter Smith  wrote:
>
> Here are my feedback comments for the V29 patch.
>

Thanks.

>
> 3.
> Previously the tablesync origin name format was encapsulated in a
> common function. IMO it was cleaner/safer how it was before, instead
> of the same "pg_%u_%u" cut/paste and scattered in many places.
> (same comment applies multiple places, in this file and in tablesync.c)
>
> 4.
> Calls like replorigin_drop_by_name(originname, true, false); make it
> unnecessarily hard to read code when the boolean params are neither
> named as variables nor commented. I noticed on another thread [et0205]
> there was an idea that having no name/comments is fine because anyway
> it is not difficult to figure out when using a "modern IDE", but since
> my review tools are only "vi" and "meld" I beg to differ with that
> justification.
> (same comment applies multiple places, in this file and in tablesync.c)
>

It would be a bit more convenient for you, but for most others I think it
would be noise. Personally, I find the code more readable without such
name comments; they just break the flow of the code unless you want to
study the value of each param in detail.
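(For concreteness, a sketch of the two styles being debated, based on the
call quoted above; the third parameter's name is assumed here:

	/* with inline names */
	replorigin_drop_by_name(originname, true /* missing_ok */, false /* nowait */);

	/* as in the patch */
	replorigin_drop_by_name(originname, true, false);
)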

> [et0205] 
> https://www.postgresql.org/message-id/c1d9833f-eeeb-40d5-89ba-87674e1b7ba3%40www.fastmail.com
>
> =
>
> FILE: tablesync.c
>
> 5.
> Previously there was a function tablesync_replorigin_drop which was
> encapsulating the tablesync origin name formatting. I thought that was
> better than the V29 code which now has the same formatting scattered
> over many places.
> (same comment applies for worker_internal.h)
>

Isn't this the same as what you want to say in point-3?

>
> 7.
> Maybe consider to just assign GetSystemIdentifier() to a static
> instead of calling that function for every slot?
> static uint64 sysid = GetSystemIdentifier();
> IIUC the sysid value is never going to change for a process, right?
>

That's right, but I am not sure there is much value in saving one
call here by introducing an extra variable.

I'll fix other comments raised by you.

-- 
With Regards,
Amit Kapila.




Re: Single transaction in the tablesync worker?

2021-02-09 Thread Peter Smith
When looking at the DropSubscription code I noticed that there is a
small difference between the HEAD code and the V29 code when slot_name
= NONE.

HEAD does
--
if (!slotname)
{
table_close(rel, NoLock);
return;
}
--

V29 does
--
if (!slotname)
{
/* be tidy */
list_free(rstates);
return;
}
--

Isn't the V29 code missing doing a table_close(rel, NoLock) there?
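(Presumably the fixed branch just keeps both cleanups, something like:

	if (!slotname)
	{
		/* be tidy */
		list_free(rstates);
		table_close(rel, NoLock);
		return;
	}
)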

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Single transaction in the tablesync worker?

2021-02-09 Thread Peter Smith
Looking at the V29 style tablesync slot names now they appear like this:

WARNING:  could not drop tablesync replication slot
"pg_16397_sync_16389_6927117142022745645"

That is in the order subid + relid + sysid.

Now that I see it in a message it seems a bit strange with the sysid
just tacked onto the end like that.

I am wondering if reordering from parent to child might be more natural,
e.g. sysid + subid + relid gives a more intuitive name, IMO.

So in this example it would be "pg_sync_6927117142022745645_16397_16389"

Thoughts?


Kind Regards,
Peter Smith
Fujitsu Australia




Re: Single transaction in the tablesync worker?

2021-02-08 Thread Peter Smith
More V29 Feedback

FILE: alter_subscription.sgml

8.
+  
+   Commands ALTER SUBSCRIPTION ... REFRESH .. and
+   ALTER SUBSCRIPTION ... SET PUBLICATION .. with refresh
+   option as true cannot be executed inside a transaction block.
+  

My guess is those two lots of double dots ("..") were probably meant
to be an ellipsis ("...").


Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Single transaction in the tablesync worker?

2021-02-08 Thread Peter Smith
Here are my feedback comments for the V29 patch.



FILE: logical-replication.sgml

+slots have generated names:
pg_%u_sync_%u_%llu
+(parameters: Subscription oid,
+Table relid, system
identifiersysid)
+   

1.
There is a missing space before the sysid parameter.

=

FILE: subscriptioncmds.c

+ * SUBREL_STATE_FINISHEDCOPY. The apply worker can also
+ * concurrently try to drop the origin and by this time the
+ * origin might be already removed. For these reasons,
+ * passing missing_ok = true from here.
+ */
+ snprintf(originname, sizeof(originname), "pg_%u_%u", sub->oid, relid);
+ replorigin_drop_by_name(originname, true, false);
+ }

2.
Don't really need to say "from here".
(same comment applies multiple places, in this file and in tablesync.c)

3.
Previously the tablesync origin name format was encapsulated in a
common function. IMO it was cleaner/safer how it was before, instead
of the same "pg_%u_%u" cut/paste and scattered in many places.
(same comment applies multiple places, in this file and in tablesync.c)

4.
Calls like replorigin_drop_by_name(originname, true, false); make it
unnecessarily hard to read code when the boolean params are neither
named as variables nor commented. I noticed on another thread [et0205]
there was an idea that having no name/comments is fine because anyway
it is not difficult to figure out when using a "modern IDE", but since
my review tools are only "vi" and "meld" I beg to differ with that
justification.
(same comment applies multiple places, in this file and in tablesync.c)

[et0205] 
https://www.postgresql.org/message-id/c1d9833f-eeeb-40d5-89ba-87674e1b7ba3%40www.fastmail.com

=

FILE: tablesync.c

5.
Previously there was a function tablesync_replorigin_drop which was
encapsulating the tablesync origin name formatting. I thought that was
better than the V29 code which now has the same formatting scattered
over many places.
(same comment applies for worker_internal.h)

+ * Determine the tablesync slot name.
+ *
+ * The name must not exceed NAMEDATALEN - 1 because of remote node constraints
+ * on slot name length. We do append system_identifier to avoid slot_name
+ * collision with subscriptions in other clusters. With current scheme
+ * pg_%u_sync_%u_UINT64_FORMAT (3 + 10 + 6 + 10 + 20 + '\0'), the maximum
+ * length of slot_name will be 50.
+ *
+ * The returned slot name is either:
+ * - stored in the supplied buffer (syncslotname), or
+ * - palloc'ed in current memory context (if syncslotname = NULL).
+ *
+ * Note: We don't use the subscription slot name as part of tablesync slot name
+ * because we are responsible for cleaning up these slots and it could become
+ * impossible to recalculate what name to cleanup if the subscription slot name
+ * had changed.
+ */
+char *
+ReplicationSlotNameForTablesync(Oid suboid, Oid relid, char
syncslotname[NAMEDATALEN])
+{
+ if (syncslotname)
+ sprintf(syncslotname, "pg_%u_sync_%u_" UINT64_FORMAT, suboid, relid,
+ GetSystemIdentifier());
+ else
+ syncslotname = psprintf("pg_%u_sync_%u_" UINT64_FORMAT, suboid, relid,
+ GetSystemIdentifier());
+
+ return syncslotname;
+}

6.
"We do append" --> "We append"
"With current scheme" -> "With the current scheme"

7.
Maybe consider to just assign GetSystemIdentifier() to a static
instead of calling that function for every slot?
static uint64 sysid = GetSystemIdentifier();
IIUC the sysid value is never going to change for a process, right?
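(A minimal sketch of that suggestion, assuming it sits inside
ReplicationSlotNameForTablesync(); note that a function call is not a valid
initializer for a C static, so it would need lazy initialization:

	static uint64 sysid = 0;

	if (sysid == 0)
		sysid = GetSystemIdentifier();

	/* ... then reuse sysid for every slot name. */
)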

--
Kind Regards,
Peter Smith.
Fujitsu Australia

On Mon, Feb 8, 2021 at 9:59 PM Amit Kapila  wrote:
>
> On Fri, Feb 5, 2021 at 8:40 PM Petr Jelinek
>  wrote:
> >
> > Hi,
> >
> > We had a bit of a high-level discussion about these patches with Amit
> > off-list, so I decided to also take a look at the actual code.
> >
>
> Thanks for the discussion and a follow-up review.
>
> > My main concern originally was the potential for left-over slots on
> > publisher, but I think the state now is relatively okay, with couple of
> > corner cases that are documented and don't seem much worse than the main
> > slot.
> >
> > I wonder if we should mention the max_slot_wal_keep_size GUC in the
> > table sync docs though.
> >
>
> I have added the reference of this in Alter Subscription where we
> mentioned the risk of leftover slots. Let me know if you have
> something else in mind?
>
> > Another thing that might need documentation is that the visibility
> > of changes done by table sync is not anymore isolated in that table
> > contents will show intermediate progress to other backends, rather than
> > switching from nothing to state consistent with rest of replication.
> >
>
> Agreed and updated the docs accordingly.
>
> >
> > Some minor comments about code:
> >
> > > + else if (res->status == WALRCV_ERROR && missing_ok)
> > > + {
> > > + /* WARNING. Error, but missing_ok = true. */
> > > + ereport(WARNING,
> >
> > I wonder if we need to add error code to the WalRcvExecResult 

RE: Single transaction in the tablesync worker?

2021-02-08 Thread osumi.takami...@fujitsu.com
On Mon, Feb 8, 2021 8:04 PM Amit Kapila  wrote:
> On Mon, Feb 8, 2021 at 12:22 PM osumi.takami...@fujitsu.com
>  wrote:
> > On Monday, February 8, 2021 1:44 PM osumi.takami...@fujitsu.com
> > 
> > > On Mon, Feb 8, 2021 11:36 AM Peter Smith 
> > > wrote:
> > > > 2. For the 004 test case I know the test is needing some PK
> > > > constraint violation # Check if DROP SUBSCRIPTION cleans up slots
> > > > on the publisher side # when the subscriber is stuck on data copy
> > > > for constraint
> > > >
> > > > But it is not clear to me what was the exact cause of that PK
> > > > violation. I think you must be relying on data that is leftover
> > > > from some previous test case but I am not sure which one. Can you
> > > > make the comment more detailed to say
> > > > *how* the PK violation is happening - e.g something to say which
> > > > rows, in which table, and inserted by who?
> > > I added some comments to clarify how the PK violation happens.
> > > Please have a look.
> > Sorry, I had a one typo in the tests of subscription.sql in v2.
> > I used 'foo' for the first test of "ALTER SUBSCRIPTION mytest SET
> > PUBLICATION foo WITH (refresh = true) in v02", but I should have used
> 'mypub' to make this test clearly independent from other previous tests.
> > Attached the fixed version.
> >
> 
> Thanks. I have integrated this into the main patch with minor modifications in
> the comments. The main change I have done is to remove the test that was
> testing that there are two slots remaining after the initial sync failure. 
> This is
> because on restart of tablesync worker we again try to drop the slot so we
> can't guarantee that the tablesync slot would be remaining. I think this is a
> timing issue so it might not have occurred on your machine but I could
> reproduce that by repeated runs of the tests provided by you.
OK, I understand. Thank you so much for modifying it
and integrating it into the main patch.


Best Regards,
Takamichi Osumi


Re: Single transaction in the tablesync worker?

2021-02-08 Thread Peter Smith
On Mon, Feb 8, 2021 at 11:42 AM Peter Smith  wrote:
>
> On Sun, Feb 7, 2021 at 2:38 PM Peter Smith  wrote:
> >
> > On Sat, Feb 6, 2021 at 2:10 AM Petr Jelinek
> >  wrote:
> > >
> > > Hi,
> > >
> > > Some minor comments about code:
> > >
> > > > + else if (res->status == WALRCV_ERROR && missing_ok)
> > > > + {
> > > > + /* WARNING. Error, but missing_ok = true. */
> > > > + ereport(WARNING,
> > >
> > > I wonder if we need to add error code to the WalRcvExecResult and check
> > > for the appropriate ones here. Because this can for example return error
> > > because of timeout, not because slot is missing. Not sure if it matters
> > > for current callers though (but then maybe don't call the param
> > > missing_ok?).
> >
> > You are right. The way we are using this function has evolved beyond
> > the original intention.
> > Probably renaming the param to something like "error_ok" would be more
> > appropriate now.
> >
>
> PSA a patch (apply on top of V28) to change the misleading param name.
>

PSA an alternative patch. This one adds a new member to
WalRcvExecResult and so is able to detect the "slot does not exist"
error. This patch also applies on top of V28, if you want it.

--
Kind Regards,
Peter Smith.
Fujitsu Australia


v1-0001-ReplicationSlotDropAtPubNode-detect-slot-does-not.patch
Description: Binary data


Re: Single transaction in the tablesync worker?

2021-02-08 Thread Amit Kapila
On Mon, Feb 8, 2021 at 12:22 PM osumi.takami...@fujitsu.com
 wrote:
>
> On Monday, February 8, 2021 1:44 PM osumi.takami...@fujitsu.com 
> 
> > On Mon, Feb 8, 2021 11:36 AM Peter Smith 
> > wrote:
> > > 2. For the 004 test case I know the test is needing some PK constraint
> > > violation # Check if DROP SUBSCRIPTION cleans up slots on the
> > > publisher side # when the subscriber is stuck on data copy for
> > > constraint
> > >
> > > But it is not clear to me what was the exact cause of that PK
> > > violation. I think you must be relying on data that is leftover from
> > > some previous test case but I am not sure which one. Can you make the
> > > comment more detailed to say
> > > *how* the PK violation is happening - e.g something to say which rows,
> > > in which table, and inserted by who?
> > I added some comments to clarify how the PK violation happens.
> > Please have a look.
> Sorry, I had a one typo in the tests of subscription.sql in v2.
> I used 'foo' for the first test of "ALTER SUBSCRIPTION mytest SET PUBLICATION 
> foo WITH (refresh = true) in v02",
> but I should have used 'mypub' to make this test clearly independent from 
> other previous tests.
> Attached the fixed version.
>

Thanks. I have integrated this into the main patch with minor
modifications in the comments. The main change I have made is to
remove the test that checked that there are two slots remaining
after the initial sync failure. This is because, on restart of the
tablesync worker, we again try to drop the slot, so we can't guarantee
that the tablesync slot remains. I think this is a timing issue, so it
might not have occurred on your machine, but I could reproduce it by
repeatedly running the tests you provided.

-- 
With Regards,
Amit Kapila.




Re: Single transaction in the tablesync worker?

2021-02-08 Thread Amit Kapila
On Fri, Feb 5, 2021 at 8:40 PM Petr Jelinek
 wrote:
>
> Hi,
>
> We had a bit of a high-level discussion about these patches with Amit
> off-list, so I decided to also take a look at the actual code.
>

Thanks for the discussion and a follow-up review.

> My main concern originally was the potential for left-over slots on
> publisher, but I think the state now is relatively okay, with couple of
> corner cases that are documented and don't seem much worse than the main
> slot.
>
> I wonder if we should mention the max_slot_wal_keep_size GUC in the
> table sync docs though.
>

I have added the reference of this in Alter Subscription where we
mentioned the risk of leftover slots. Let me know if you have
something else in mind?

> Another thing that might need documentation is that the visibility
> of changes done by table sync is not anymore isolated in that table
> contents will show intermediate progress to other backends, rather than
> switching from nothing to state consistent with rest of replication.
>

Agreed and updated the docs accordingly.

>
> Some minor comments about code:
>
> > + else if (res->status == WALRCV_ERROR && missing_ok)
> > + {
> > + /* WARNING. Error, but missing_ok = true. */
> > + ereport(WARNING,
>
> I wonder if we need to add error code to the WalRcvExecResult and check
> for the appropriate ones here. Because this can for example return error
> because of timeout, not because slot is missing.
>

I think there are both pros and cons to distinguishing the error
("slot does not exist") from others. The benefit is that if there is a
network glitch, then the user can probably retry the Alter/Drop commands
and they will be successful next time. OTOH, say the network is broken
for a long time and the user wants to proceed; there won't be any way to
proceed for the Alter Subscription ... Refresh or Drop command. So by
giving a WARNING, at least we provide a way to proceed, and then they
can drop such slots later. We have mentioned this in the docs as well. I
think we can go either way here; let me know which you think is the
better way.

> Not sure if it matters
> for current callers though (but then maybe don't call the param
> missing_ok?).
>

Sure, if we decide not to change the behavior as suggested by you then
this makes sense.

>
> > +ReplicationSlotNameForTablesync(Oid suboid, Oid relid, char 
> > syncslotname[NAMEDATALEN])
> > +{
> > + if (syncslotname)
> > + sprintf(syncslotname, "pg_%u_sync_%u", suboid, relid);
> > + else
> > + syncslotname = psprintf("pg_%u_sync_%u", suboid, relid);
> > +
> > + return syncslotname;
> > +}
>
> Given that we are now explicitly dropping slots, what happens here if we
> have 2 different downstreams that happen to get same suboid and reloid,
> will one of the drop the slot of the other one? Previously with the
> cleanup being left to temp slot we'd at maximum got error when creating
> it but with the new logic in LogicalRepSyncTableStart it feels like we
> could get into situation where 2 downstreams are fighting over slot no?
>

As discussed, added system_identifier to distinguish subscriptions
between different clusters.

Apart from fixing the above comment, I have integrated it with the new
replorigin_drop_by_name() API being discussed in the thread [1] and
posted that patch just for ease. I have also integrated Osumi-San's
test case patch with minor modifications.

[1] - 
https://www.postgresql.org/message-id/CAA4eK1L7mLhY%3DwyCB0qsEGUpfzWfncDSS9_0a4Co%2BN0GUyNGNQ%40mail.gmail.com

-- 
With Regards,
Amit Kapila.


v29-0001-Make-pg_replication_origin_drop-safe-against-con.patch
Description: Binary data


v29-0002-Allow-multiple-xacts-during-table-sync-in-logica.patch
Description: Binary data


RE: Single transaction in the tablesync worker?

2021-02-07 Thread osumi.takami...@fujitsu.com
On Monday, February 8, 2021 1:44 PM osumi.takami...@fujitsu.com 

> On Mon, Feb 8, 2021 12:40 PM Amit Kapila 
> wrote:
> > On Mon, Feb 8, 2021 at 8:06 AM Peter Smith 
> > wrote:
> > >
> > > On Sat, Feb 6, 2021 at 6:30 PM osumi.takami...@fujitsu.com
> > >  wrote:
> > > >
> > > > > I have another idea for a test case: What if we write a test
> > > > > such that it fails PK violation on copy and then drop the 
> > > > > subscription.
> > > > > Then check there shouldn't be any dangling slot on the publisher?
> > > > > This is similar to a test in subscription/t/004_sync.pl, we can
> > > > > use some of that framework but have a separate test for this.
> > > > I've added this PK violation test to the attached tests.
> > > > The patch works with v28 and made no failure during regression tests.
> > > >
> > >
> > > I checked this patch. It applied cleanly on top of V28, and all
> > > tests passed
> > OK.
> > >
> > > Here are two feedback comments.
> > >
> > > 1. For the regression test there is 2 x SQL and 1 x function test. I
> > > thought to cover all the combinations there should be another
> > > function test. e.g.
> > > Tests ALTER … REFRESH
> > > Tests ALTER …. (refresh = true)
> > > Tests ALTER … (refresh = true) in a function Tests ALTER … REFRESH
> > > in a function  <== this combination is not being testing ??
> > >
> >
> > I am not sure whether there is much value in adding more to this set
> > of negative test cases unless it really covers a different code path
> > which I think won't happen if we add more tests here.
> Yeah, I agree. Accordingly, I didn't fix that part.
> 
> 
> On Mon, Feb 8, 2021 11:36 AM Peter Smith 
> wrote:
> > 2. For the 004 test case I know the test is needing some PK constraint
> > violation # Check if DROP SUBSCRIPTION cleans up slots on the
> > publisher side # when the subscriber is stuck on data copy for
> > constraint
> >
> > But it is not clear to me what was the exact cause of that PK
> > violation. I think you must be relying on data that is leftover from
> > some previous test case but I am not sure which one. Can you make the
> > comment more detailed to say
> > *how* the PK violation is happening - e.g something to say which rows,
> > in which table, and inserted by who?
> I added some comments to clarify how the PK violation happens.
> Please have a look.
Sorry, I had a typo in the tests of subscription.sql in v2.
I used 'foo' for the first test, "ALTER SUBSCRIPTION mytest SET PUBLICATION
foo WITH (refresh = true)", in v02,
but I should have used 'mypub' to make this test clearly independent from the
other previous tests.
Attached the fixed version.

Best Regards,
Takamichi Osumi



refresh_and_pk_violation_testsets_v03.patch
Description: refresh_and_pk_violation_testsets_v03.patch


RE: Single transaction in the tablesync worker?

2021-02-07 Thread osumi.takami...@fujitsu.com
Hello


On Mon, Feb 8, 2021 12:40 PM Amit Kapila  wrote:
> On Mon, Feb 8, 2021 at 8:06 AM Peter Smith 
> wrote:
> >
> > On Sat, Feb 6, 2021 at 6:30 PM osumi.takami...@fujitsu.com
> >  wrote:
> > >
> > > > I have another idea for a test case: What if we write a test such
> > > > that it fails PK violation on copy and then drop the subscription.
> > > > Then check there shouldn't be any dangling slot on the publisher?
> > > > This is similar to a test in subscription/t/004_sync.pl, we can
> > > > use some of that framework but have a separate test for this.
> > > I've added this PK violation test to the attached tests.
> > > The patch works with v28 and made no failure during regression tests.
> > >
> >
> > I checked this patch. It applied cleanly on top of V28, and all tests passed
> OK.
> >
> > Here are two feedback comments.
> >
> > 1. For the regression test there is 2 x SQL and 1 x function test. I
> > thought to cover all the combinations there should be another function
> > test. e.g.
> > Tests ALTER … REFRESH
> > Tests ALTER …. (refresh = true)
> > Tests ALTER … (refresh = true) in a function Tests ALTER … REFRESH in
> > a function  <== this combination is not being testing ??
> >
> 
> I am not sure whether there is much value in adding more to this set of
> negative test cases unless it really covers a different code path which I 
> think
> won't happen if we add more tests here.
Yeah, I agree. Accordingly, I didn't fix that part.


On Mon, Feb 8, 2021 11:36 AM Peter Smith  wrote:
> 2. For the 004 test case I know the test is needing some PK constraint
> violation # Check if DROP SUBSCRIPTION cleans up slots on the publisher
> side # when the subscriber is stuck on data copy for constraint
> 
> But it is not clear to me what was the exact cause of that PK violation. I 
> think
> you must be relying on data that is leftover from some previous test case but
> I am not sure which one. Can you make the comment more detailed to say
> *how* the PK violation is happening - e.g something to say which rows, in
> which table, and inserted by who?
I added some comments to clarify how the PK violation happens.
Please have a look.


Best Regards,
Takamichi Osumi


refresh_and_pk_violation_testsets_v02.patch
Description: refresh_and_pk_violation_testsets_v02.patch


Re: Single transaction in the tablesync worker?

2021-02-07 Thread Amit Kapila
On Mon, Feb 8, 2021 at 8:06 AM Peter Smith  wrote:
>
> On Sat, Feb 6, 2021 at 6:30 PM osumi.takami...@fujitsu.com
>  wrote:
> >
> > > I have another idea for a test case: What if we write a test such that it 
> > > fails PK
> > > violation on copy and then drop the subscription. Then check there 
> > > shouldn't
> > > be any dangling slot on the publisher? This is similar to a test in
> > > subscription/t/004_sync.pl, we can use some of that framework but have a
> > > separate test for this.
> > I've added this PK violation test to the attached tests.
> > The patch works with v28 and made no failure during regression tests.
> >
>
> I checked this patch. It applied cleanly on top of V28, and all tests passed 
> OK.
>
> Here are two feedback comments.
>
> 1. For the regression test there is 2 x SQL and 1 x function test. I
> thought to cover all the combinations there should be another function
> test. e.g.
> Tests ALTER … REFRESH
> Tests ALTER …. (refresh = true)
> Tests ALTER … (refresh = true) in a function
> Tests ALTER … REFRESH in a function  <== this combination is not being
> testing ??
>

I am not sure whether there is much value in adding more to this set
of negative test cases unless it really covers a different code path
which I think won't happen if we add more tests here.

-- 
With Regards,
Amit Kapila.




Re: Single transaction in the tablesync worker?

2021-02-07 Thread Peter Smith
On Sat, Feb 6, 2021 at 6:30 PM osumi.takami...@fujitsu.com
 wrote:
>
> Hi
>
>
> On Friday, February 5, 2021 5:51 PM Amit Kapila  
> wrote:
> > On Fri, Feb 5, 2021 at 12:36 PM osumi.takami...@fujitsu.com
> >  wrote:
> > >
> > > We need to add some tests to prove the new checks of AlterSubscription()
> > work.
> > > I chose TAP tests as we need to set connect = true for the subscription.
> > > When it can contribute to the development, please utilize this.
> > > I used v28 to check my patch and works as we expect.
> > >
> >
> > Thanks for writing the tests but I don't understand why you need to set
> > connect = true for this test? I have tried below '... with connect = false' 
> > and it
> > seems to be working:
> > postgres=# CREATE SUBSCRIPTION mysub
> > postgres-#  CONNECTION 'host=localhost port=5432
> > dbname=postgres'
> > postgres-# PUBLICATION mypublication WITH (connect = false);
> > WARNING:  tables were not subscribed, you will have to run ALTER
> > SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables CREATE
> > SUBSCRIPTION postgres=# Begin; BEGIN postgres=*# Alter Subscription
> > mysub Refresh Publication;
> > ERROR:  ALTER SUBSCRIPTION ... REFRESH is not allowed for disabled
> > subscriptions
> >
> > So, if possible lets write this test in 
> > src/test/regress/sql/subscription.sql.
> OK. I changed the place to write the tests for those.
>
>
> > I have another idea for a test case: What if we write a test such that it 
> > fails PK
> > violation on copy and then drop the subscription. Then check there shouldn't
> > be any dangling slot on the publisher? This is similar to a test in
> > subscription/t/004_sync.pl, we can use some of that framework but have a
> > separate test for this.
> I've added this PK violation test to the attached tests.
> The patch works with v28 and made no failure during regression tests.
>

I checked this patch. It applied cleanly on top of V28, and all tests passed OK.

Here are two feedback comments.

1. For the regression test there are 2 x SQL and 1 x function test. I
thought that to cover all the combinations there should be another function
test, e.g.:
Tests ALTER … REFRESH
Tests ALTER …. (refresh = true)
Tests ALTER … (refresh = true) in a function
Tests ALTER … REFRESH in a function  <== this combination is not being
tested??

2. For the 004 test case I know the test needs some PK constraint violation:
# Check if DROP SUBSCRIPTION cleans up slots on the publisher side
# when the subscriber is stuck on data copy for constraint

But it is not clear to me what was the exact cause of that PK
violation. I think you must be relying on data that is leftover from
some previous test case but I am not sure which one. Can you make the
comment more detailed to say *how* the PK violation is happening - e.g
something to say which rows, in which table, and inserted by who?

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Single transaction in the tablesync worker?

2021-02-07 Thread Peter Smith
On Sun, Feb 7, 2021 at 2:38 PM Peter Smith  wrote:
>
> On Sat, Feb 6, 2021 at 2:10 AM Petr Jelinek
>  wrote:
> >
> > Hi,
> >
> > Some minor comments about code:
> >
> > > + else if (res->status == WALRCV_ERROR && missing_ok)
> > > + {
> > > + /* WARNING. Error, but missing_ok = true. */
> > > + ereport(WARNING,
> >
> > I wonder if we need to add error code to the WalRcvExecResult and check
> > for the appropriate ones here. Because this can for example return error
> > because of timeout, not because slot is missing. Not sure if it matters
> > for current callers though (but then maybe don't call the param
> > missign_ok?).
>
> You are right. The way we are using this function has evolved beyond
> the original intention.
> Probably renaming the param to something like "error_ok" would be more
> appropriate now.
>

PSA a patch (apply on top of V28) to change the misleading param name.


Kind Regards,
Peter Smith.
Fujitsu Australia


v1-0001-ReplicationSlotDropAtPubNode-param.patch
Description: Binary data


Re: Single transaction in the tablesync worker?

2021-02-06 Thread Peter Smith
On Sat, Feb 6, 2021 at 2:10 AM Petr Jelinek
 wrote:
>
> Hi,
>
> Some minor comments about code:
>
> > + else if (res->status == WALRCV_ERROR && missing_ok)
> > + {
> > + /* WARNING. Error, but missing_ok = true. */
> > + ereport(WARNING,
>
> I wonder if we need to add error code to the WalRcvExecResult and check
> for the appropriate ones here. Because this can for example return error
> because of timeout, not because slot is missing. Not sure if it matters
> for current callers though (but then maybe don't call the param
> missign_ok?).

You are right. The way we are using this function has evolved beyond
the original intention.
Probably renaming the param to something like "error_ok" would be more
appropriate now.


Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Single transaction in the tablesync worker?

2021-02-06 Thread Petr Jelinek



On 06/02/2021 06:07, Amit Kapila wrote:
> On Sat, Feb 6, 2021 at 6:22 AM Peter Smith  wrote:
>> On Sat, Feb 6, 2021 at 2:10 AM Petr Jelinek
>>  wrote:
>>>
>>> +ReplicationSlotNameForTablesync(Oid suboid, Oid relid, char 
>>> syncslotname[NAMEDATALEN])
>>> +{
>>> + if (syncslotname)
>>> + sprintf(syncslotname, "pg_%u_sync_%u", suboid, relid);
>>> + else
>>> + syncslotname = psprintf("pg_%u_sync_%u", suboid, relid);
>>> +
>>> + return syncslotname;
>>> +}
>>>
>>> Given that we are now explicitly dropping slots, what happens here if we
>>> have 2 different downstreams that happen to get same suboid and reloid,
>>> will one of the drop the slot of the other one? Previously with the
>>> cleanup being left to temp slot we'd at maximum got error when creating
>>> it but with the new logic in LogicalRepSyncTableStart it feels like we
>>> could get into situation where 2 downstreams are fighting over slot no?
>>
> I think so. See, if the alternative suggested below works or if you
> have any other suggestions for the same?
>
>> The PG docs [1] says "there is only one copy of pg_subscription per
>> cluster, not one per database". IIUC that means it is not possible for
>> 2 different subscriptions to have the same suboid.
>>
> I think he is talking about two different clusters having separate
> subscriptions but point to the same publisher. In different clusters,
> we can get the same subid/relid. I think we need a cluster-wide unique
> identifier to distinguish among different subscribers. How about using
> the system_identifier stored in the control file (we can use
> GetSystemIdentifier to retrieve it).  I think one concern could be
> that adding that to slot name could exceed the max length of slot
> (NAMEDATALEN -1) but I don't think that is the case here
> (pg_%u_sync_%u_UINT64_FORMAT (3 + 10 + 6 + 10 + 20 + '\0')). Note last
> is system_identifier in this scheme.

Yep, that's what I mean, and system_identifier seems like a good choice to me.

> Do you guys think that works or let me know if you have any other
> better idea? Petr, is there a reason why such an identifier is not
> considered originally, is there any risk in it?

Originally it was not considered likely because it's all based on
pglogical/BDR work where ids are hashes of stuff that's unique across a
group of instances, not counter-based like Oids in PostgreSQL, and I
simply didn't realize it could be a problem until reading this patch :)



--
Petr Jelinek





RE: Single transaction in the tablesync worker?

2021-02-05 Thread osumi.takami...@fujitsu.com
Hi


On Friday, February 5, 2021 5:51 PM Amit Kapila  wrote:
> On Fri, Feb 5, 2021 at 12:36 PM osumi.takami...@fujitsu.com
>  wrote:
> >
> > We need to add some tests to prove the new checks of AlterSubscription()
> work.
> > I chose TAP tests as we need to set connect = true for the subscription.
> > When it can contribute to the development, please utilize this.
> > I used v28 to check my patch and works as we expect.
> >
> 
> Thanks for writing the tests but I don't understand why you need to set
> connect = true for this test? I have tried below '... with connect = false' 
> and it
> seems to be working:
> postgres=# CREATE SUBSCRIPTION mysub
> postgres-#  CONNECTION 'host=localhost port=5432
> dbname=postgres'
> postgres-# PUBLICATION mypublication WITH (connect = false);
> WARNING:  tables were not subscribed, you will have to run ALTER
> SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables CREATE
> SUBSCRIPTION postgres=# Begin; BEGIN postgres=*# Alter Subscription
> mysub Refresh Publication;
> ERROR:  ALTER SUBSCRIPTION ... REFRESH is not allowed for disabled
> subscriptions
> 
> So, if possible lets write this test in src/test/regress/sql/subscription.sql.
OK. I changed the place to write the tests for those.

 
> I have another idea for a test case: What if we write a test such that it 
> fails PK
> violation on copy and then drop the subscription. Then check there shouldn't
> be any dangling slot on the publisher? This is similar to a test in
> subscription/t/004_sync.pl, we can use some of that framework but have a
> separate test for this.
I've added this PK violation test to the attached tests.
The patch works with v28 and caused no failures during the regression tests.

Best Regards,
Takamichi Osumi



refresh_and_pk_violation_testsets.patch
Description: refresh_and_pk_violation_testsets.patch


Re: Single transaction in the tablesync worker?

2021-02-05 Thread Amit Kapila
On Sat, Feb 6, 2021 at 6:22 AM Peter Smith  wrote:
>
> On Sat, Feb 6, 2021 at 2:10 AM Petr Jelinek
>  wrote:
> >
> > > +ReplicationSlotNameForTablesync(Oid suboid, Oid relid, char 
> > > syncslotname[NAMEDATALEN])
> > > +{
> > > + if (syncslotname)
> > > + sprintf(syncslotname, "pg_%u_sync_%u", suboid, relid);
> > > + else
> > > + syncslotname = psprintf("pg_%u_sync_%u", suboid, relid);
> > > +
> > > + return syncslotname;
> > > +}
> >
> > Given that we are now explicitly dropping slots, what happens here if we
> > have 2 different downstreams that happen to get same suboid and reloid,
> > will one of the drop the slot of the other one? Previously with the
> > cleanup being left to temp slot we'd at maximum got error when creating
> > it but with the new logic in LogicalRepSyncTableStart it feels like we
> > could get into situation where 2 downstreams are fighting over slot no?
> >

I think so. See, if the alternative suggested below works or if you
have any other suggestions for the same?

>
> The PG docs [1] says "there is only one copy of pg_subscription per
> cluster, not one per database". IIUC that means it is not possible for
> 2 different subscriptions to have the same suboid.
>

I think he is talking about two different clusters having separate
subscriptions but pointing to the same publisher. In different clusters,
we can get the same subid/relid. I think we need a cluster-wide unique
identifier to distinguish among different subscribers. How about using
the system_identifier stored in the control file (we can use
GetSystemIdentifier to retrieve it)?  I think one concern could be
that adding that to the slot name could exceed the max length of a slot
name (NAMEDATALEN - 1), but I don't think that is the case here
(pg_%u_sync_%u_UINT64_FORMAT (3 + 10 + 6 + 10 + 20 + '\0')). Note the last
part is the system_identifier in this scheme.

Do you guys think that works, or let me know if you have any other
better idea? Petr, is there a reason why such an identifier was not
considered originally? Is there any risk in it?
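
Just to make the length argument concrete, here is a small standalone sketch
(stubbed GetSystemIdentifier and illustrative only, not the patch's actual
code) that formats a worst-case name in this scheme and checks it stays under
NAMEDATALEN:

/* A minimal sketch, assuming a "pg_<suboid>_sync_<relid>_<sysid>" scheme;
 * get_system_identifier() is a stub for GetSystemIdentifier(), which in
 * PostgreSQL reads the value from the control file. */
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define NAMEDATALEN 64                 /* PostgreSQL's limit for names */

static uint64_t
get_system_identifier(void)
{
    return UINT64_MAX;                 /* worst case: 20 decimal digits */
}

int
main(void)
{
    char        slotname[NAMEDATALEN];
    uint32_t    suboid = UINT32_MAX;   /* worst case: 10 digits */
    uint32_t    relid = UINT32_MAX;    /* worst case: 10 digits */
    int         len;

    /* "pg_" (3) + 10 + "_sync_" (6) + 10 + "_" (1) + 20 + '\0' (1) = 51 <= 64 */
    len = snprintf(slotname, sizeof(slotname),
                   "pg_%" PRIu32 "_sync_%" PRIu32 "_%" PRIu64,
                   suboid, relid, get_system_identifier());
    assert(len > 0 && len < NAMEDATALEN);
    printf("%s (%d chars)\n", slotname, len);
    return 0;
}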

-- 
With Regards,
Amit Kapila.




Re: Single transaction in the tablesync worker?

2021-02-05 Thread Peter Smith
On Sat, Feb 6, 2021 at 2:10 AM Petr Jelinek
 wrote:
>
> > +ReplicationSlotNameForTablesync(Oid suboid, Oid relid, char 
> > syncslotname[NAMEDATALEN])
> > +{
> > + if (syncslotname)
> > + sprintf(syncslotname, "pg_%u_sync_%u", suboid, relid);
> > + else
> > + syncslotname = psprintf("pg_%u_sync_%u", suboid, relid);
> > +
> > + return syncslotname;
> > +}
>
> Given that we are now explicitly dropping slots, what happens here if we
> have 2 different downstreams that happen to get same suboid and reloid,
> will one of the drop the slot of the other one? Previously with the
> cleanup being left to temp slot we'd at maximum got error when creating
> it but with the new logic in LogicalRepSyncTableStart it feels like we
> could get into situation where 2 downstreams are fighting over slot no?
>

The PG docs [1] say "there is only one copy of pg_subscription per
cluster, not one per database". IIUC that means it is not possible for
2 different subscriptions to have the same suboid. And if the suboid
is globally unique then the syncslotname is also unique. Is that
understanding not correct?

-
[1] https://www.postgresql.org/docs/devel/catalog-pg-subscription.html

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Single transaction in the tablesync worker?

2021-02-05 Thread Petr Jelinek

Hi,

We had a bit of a high-level discussion about these patches with Amit 
off-list, so I decided to also take a look at the actual code.


My main concern originally was the potential for left-over slots on the 
publisher, but I think the state now is relatively okay, with a couple of 
corner cases that are documented and don't seem much worse than the main 
slot.


I wonder if we should mention the max_slot_wal_keep_size GUC in the 
table sync docs though.


Another thing that might need documentation is that the visibility 
of changes done by table sync is no longer isolated, in that the table 
contents will show intermediate progress to other backends, rather than 
switching from nothing to a state consistent with the rest of replication.



Some minor comments about code:


+   else if (res->status == WALRCV_ERROR && missing_ok)
+   {
+   /* WARNING. Error, but missing_ok = true. */
+   ereport(WARNING,


I wonder if we need to add an error code to the WalRcvExecResult and check 
for the appropriate ones here, because this can, for example, return an error 
because of a timeout, not because the slot is missing. Not sure if it matters 
for current callers though (but then maybe don't call the param 
missing_ok?).
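
For what it's worth, a tiny standalone sketch of that idea could look like
the following; the struct and error codes here are simplified stand-ins, not
the real WalRcvExecResult. Only a "slot does not exist" error is tolerated
when the caller opts in; a timeout still errors out:

#include <stdbool.h>
#include <stdio.h>

typedef enum { WALRCV_OK_COMMAND, WALRCV_ERROR } WalRcvStatus;
typedef enum { ERR_NONE, ERR_UNDEFINED_OBJECT, ERR_TIMEOUT } WalRcvErrCode;

typedef struct
{
    WalRcvStatus    status;
    WalRcvErrCode   errcode;    /* the extra field being suggested */
    const char     *err;
} ExecResult;

static void
report_drop_slot(const ExecResult *res, const char *slotname, bool missing_ok)
{
    if (res->status == WALRCV_OK_COMMAND)
        printf("NOTICE: dropped replication slot \"%s\"\n", slotname);
    else if (res->status == WALRCV_ERROR &&
             missing_ok && res->errcode == ERR_UNDEFINED_OBJECT)
        printf("WARNING: slot \"%s\" did not exist: %s\n", slotname, res->err);
    else
        printf("ERROR: could not drop slot \"%s\": %s\n", slotname, res->err);
}

int
main(void)
{
    ExecResult missing = { WALRCV_ERROR, ERR_UNDEFINED_OBJECT, "slot does not exist" };
    ExecResult timeout = { WALRCV_ERROR, ERR_TIMEOUT, "timeout expired" };

    report_drop_slot(&missing, "pg_16479_sync_16435", true);   /* tolerated */
    report_drop_slot(&timeout, "pg_16479_sync_16435", true);   /* still an error */
    return 0;
}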




+ReplicationSlotNameForTablesync(Oid suboid, Oid relid, char 
syncslotname[NAMEDATALEN])
+{
+   if (syncslotname)
+   sprintf(syncslotname, "pg_%u_sync_%u", suboid, relid);
+   else
+   syncslotname = psprintf("pg_%u_sync_%u", suboid, relid);
+
+   return syncslotname;
+}


Given that we are now explicitly dropping slots, what happens here if we 
have 2 different downstreams that happen to get the same suboid and reloid; 
will one of them drop the slot of the other one? Previously, with the 
cleanup being left to the temp slot, we'd at maximum get an error when 
creating it, but with the new logic in LogicalRepSyncTableStart it feels like 
we could get into a situation where 2 downstreams are fighting over the slot, no?



--
Petr




Re: Single transaction in the tablesync worker?

2021-02-05 Thread Ajin Cherian
I did some basic cross-version testing, publisher on PG13 and
subscriber on PG14 and publisher on PG14 and subscriber on PG13.
Did some basic operations, CREATE, ALTER and STOP subscriptions and it
seemed to work fine, no errors.

regards,
Ajin Cherian
Fujitsu Australia.




Re: Single transaction in the tablesync worker?

2021-02-05 Thread Amit Kapila
On Fri, Feb 5, 2021 at 12:36 PM osumi.takami...@fujitsu.com
 wrote:
>
> We need to add some tests to prove the new checks of AlterSubscription() work.
> I chose TAP tests as we need to set connect = true for the subscription.
> When it can contribute to the development, please utilize this.
> I used v28 to check my patch and works as we expect.
>

Thanks for writing the tests but I don't understand why you need to
set connect = true for this test? I have tried below '... with connect
= false' and it seems to be working:
postgres=# CREATE SUBSCRIPTION mysub
postgres-#  CONNECTION 'host=localhost port=5432 dbname=postgres'
postgres-# PUBLICATION mypublication WITH (connect = false);
WARNING:  tables were not subscribed, you will have to run ALTER
SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
CREATE SUBSCRIPTION
postgres=# Begin;
BEGIN
postgres=*# Alter Subscription mysub Refresh Publication;
ERROR:  ALTER SUBSCRIPTION ... REFRESH is not allowed for disabled subscriptions

So, if possible lets write this test in src/test/regress/sql/subscription.sql.

I have another idea for a test case: What if we write a test such that
it fails PK violation on copy and then drop the subscription. Then
check there shouldn't be any dangling slot on the publisher? This is
similar to a test in subscription/t/004_sync.pl, we can use some of
that framework but have a separate test for this.

-- 
With Regards,
Amit Kapila.




RE: Single transaction in the tablesync worker?

2021-02-04 Thread osumi.takami...@fujitsu.com
Hello



On Friday, February 5, 2021 2:23 PM Amit Kapila 
> On Fri, Feb 5, 2021 at 7:09 AM Peter Smith  wrote:
> >
> > On Thu, Feb 4, 2021 at 8:33 PM Amit Kapila 
> wrote:
> > >
> > ...
> >
> > > Thanks. I have fixed one of the issues reported by me earlier [1]
> > > wherein the tablesync worker can repeatedly fail if after dropping
> > > the slot there is an error while updating the SYNCDONE state in the
> > > database. I have moved the drop of the slot just before commit of
> > > the transaction where we are marking the state as SYNCDONE.
> > > Additionally, I have removed unnecessary includes in tablesync.c,
> > > updated the docs for Alter Subscription, and updated the comments at
> > > various places in the patch. I have also updated the commit message this
> time.
> > >
> >
> > Below are my feedback comments for V17 (nothing functional)
> >
> > ~~
> >
> > 1.
> > V27 Commit message:
> > For the initial table data synchronization in logical replication, we
> > use a single transaction to copy the entire table and then
> > synchronizes the position in the stream with the main apply worker.
> >
> > Typo:
> > "synchronizes" -> "synchronize"
> >
> 
> Fixed and added a note about Alter Sub .. Refresh .. command can't be
> executed in the transaction block.
Thank you for the updates.

We need to add some tests to prove the new checks of AlterSubscription() work. 
I chose TAP tests as we need to set connect = true for the subscription.
If it can contribute to the development, please make use of it.
I used v28 to check my patch, and it works as we expect.


Best Regards,
Takamichi Osumi


AlterSubscription_with_refresh_tests.patch
Description: AlterSubscription_with_refresh_tests.patch


Re: Single transaction in the tablesync worker?

2021-02-04 Thread Amit Kapila
On Fri, Feb 5, 2021 at 7:09 AM Peter Smith  wrote:
>
> On Thu, Feb 4, 2021 at 8:33 PM Amit Kapila  wrote:
> >
> ...
>
> > Thanks. I have fixed one of the issues reported by me earlier [1]
> > wherein the tablesync worker can repeatedly fail if after dropping the
> > slot there is an error while updating the SYNCDONE state in the
> > database. I have moved the drop of the slot just before commit of the
> > transaction where we are marking the state as SYNCDONE. Additionally,
> > I have removed unnecessary includes in tablesync.c, updated the docs
> > for Alter Subscription, and updated the comments at various places in
> > the patch. I have also updated the commit message this time.
> >
>
> Below are my feedback comments for V17 (nothing functional)
>
> ~~
>
> 1.
> V27 Commit message:
> For the initial table data synchronization in logical replication, we use
> a single transaction to copy the entire table and then synchronizes the
> position in the stream with the main apply worker.
>
> Typo:
> "synchronizes" -> "synchronize"
>

Fixed and added a note about Alter Sub .. Refresh .. command can't be
executed in the transaction block.

> ~~
>
> 2.
> @@ -48,6 +48,23 @@ ALTER SUBSCRIPTION  class="parameter">name RENAME TO <
> (Currently, all subscription owners must be superusers, so the owner 
> checks
> will be bypassed in practice.  But this might change in the future.)
>
> +
> +  
> +   When refreshing a publication we remove the relations that are no longer
> +   part of the publication and we also remove the tablesync slots if there 
> are
> +   any. It is necessary to remove tablesync slots so that the resources
> +   allocated for the subscription on the remote host are released. If due to
> +   network breakdown or some other error, we are not able to remove the 
> slots,
> +   we give WARNING and the user needs to manually remove such slots later as
> +   otherwise, they will continue to reserve WAL and might eventually cause
> +   the disk to fill up. See also  linkend="logical-replication-subscription-slot"/>.
> +  
>
> I think the content is good, but the 1st-person wording seemed strange.
> e.g.
> "we are not able to remove the slots, we give WARNING and the user needs..."
> Maybe it should be like:
> "... PostgreSQL is unable to remove the slots, so a WARNING is
> reported. The user needs... "
>

Changed as per suggestion with a minor tweak.

> ~~
>
> 3.
> @@ -566,107 +569,197 @@ AlterSubscription_refresh(Subscription *sub,
> bool copy_data)
> ...
> + * XXX If there is a network break down while dropping the
>
> "network break down" -> "network breakdown"
>
> ~~
>
> 4.
> @@ -872,7 +970,48 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos)
>   (errmsg("could not connect to the publisher: %s", err)));
> ...
> + * XXX We could also instead try to drop the slot, last time we failed
> + * but for that, we might need to clean up the copy state as it might
> + * be in the middle of fetching the rows. Also, if there is a network
> + * break down then it wouldn't have succeeded so trying it next time
> + * seems like a better bet.
>
> "network break down" -> "network breakdown"
>

Changed as per suggestion.

> ~~
>
> 5.
> @@ -269,26 +313,47 @@ invalidate_syncing_table_states(Datum arg, int
> cacheid, uint32 hashvalue)
> ...
> +
> + /*
> + * Cleanup the tablesync slot.
> + *
> + * This has to be done after updating the state because otherwise if
> + * there is an error while doing the database operations we won't be
> + * able to rollback dropped slot.
> + */
> + ReplicationSlotNameForTablesync(MyLogicalRepWorker->subid,
> + MyLogicalRepWorker->relid,
> + syncslotname);
> +
> + ReplicationSlotDropAtPubNode(wrconn, syncslotname, false /* missing_ok */);
> +
>
> Should this comment also describe why the missing_ok is false for this case?
>

Yeah that makes sense, so added a comment.

Additionally, I have changed the errorcode in RemoveSubscriptionRel and
moved the setup of the origin before copy_table in
LogicalRepSyncTableStart to avoid redoing the copy due to an error in
setting up the origin. I have made a few comment changes as well.

-- 
With Regards,
Amit Kapila.


v28-0001-Allow-multiple-xacts-during-table-sync-in-logica.patch
Description: Binary data


Re: Single transaction in the tablesync worker?

2021-02-04 Thread Peter Smith
On Thu, Feb 4, 2021 at 8:33 PM Amit Kapila  wrote:
>
...

> Thanks. I have fixed one of the issues reported by me earlier [1]
> wherein the tablesync worker can repeatedly fail if after dropping the
> slot there is an error while updating the SYNCDONE state in the
> database. I have moved the drop of the slot just before commit of the
> transaction where we are marking the state as SYNCDONE. Additionally,
> I have removed unnecessary includes in tablesync.c, updated the docs
> for Alter Subscription, and updated the comments at various places in
> the patch. I have also updated the commit message this time.
>

Below are my feedback comments for V27 (nothing functional)

~~

1.
V27 Commit message:
For the initial table data synchronization in logical replication, we use
a single transaction to copy the entire table and then synchronizes the
position in the stream with the main apply worker.

Typo:
"synchronizes" -> "synchronize"

~~

2.
@@ -48,6 +48,23 @@ ALTER SUBSCRIPTION name RENAME TO <
(Currently, all subscription owners must be superusers, so the owner checks
will be bypassed in practice.  But this might change in the future.)
   
+
+  
+   When refreshing a publication we remove the relations that are no longer
+   part of the publication and we also remove the tablesync slots if there are
+   any. It is necessary to remove tablesync slots so that the resources
+   allocated for the subscription on the remote host are released. If due to
+   network breakdown or some other error, we are not able to remove the slots,
+   we give WARNING and the user needs to manually remove such slots later as
+   otherwise, they will continue to reserve WAL and might eventually cause
+   the disk to fill up. See also .
+  

I think the content is good, but the 1st-person wording seemed strange.
e.g.
"we are not able to remove the slots, we give WARNING and the user needs..."
Maybe it should be like:
"... PostgreSQL is unable to remove the slots, so a WARNING is
reported. The user needs... "

~~

3.
@@ -566,107 +569,197 @@ AlterSubscription_refresh(Subscription *sub,
bool copy_data)
...
+ * XXX If there is a network break down while dropping the

"network break down" -> "network breakdown"

~~

4.
@@ -872,7 +970,48 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos)
  (errmsg("could not connect to the publisher: %s", err)));
...
+ * XXX We could also instead try to drop the slot, last time we failed
+ * but for that, we might need to clean up the copy state as it might
+ * be in the middle of fetching the rows. Also, if there is a network
+ * break down then it wouldn't have succeeded so trying it next time
+ * seems like a better bet.

"network break down" -> "network breakdown"

~~

5.
@@ -269,26 +313,47 @@ invalidate_syncing_table_states(Datum arg, int
cacheid, uint32 hashvalue)
...
+
+ /*
+ * Cleanup the tablesync slot.
+ *
+ * This has to be done after updating the state because otherwise if
+ * there is an error while doing the database operations we won't be
+ * able to rollback dropped slot.
+ */
+ ReplicationSlotNameForTablesync(MyLogicalRepWorker->subid,
+ MyLogicalRepWorker->relid,
+ syncslotname);
+
+ ReplicationSlotDropAtPubNode(wrconn, syncslotname, false /* missing_ok */);
+

Should this comment also describe why the missing_ok is false for this case?


Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Single transaction in the tablesync worker?

2021-02-04 Thread Amit Kapila
On Thu, Feb 4, 2021 at 9:55 AM Ajin Cherian  wrote:
>
> On Wed, Feb 3, 2021 at 11:38 PM Amit Kapila  wrote:
>
> > Thanks for the report. The problem here was that the error occurred
> > when we were trying to copy the large data. Now, before fetching the
> > entire data we issued a rollback that led to this problem. I think the
> > alternative here could be to first fetch the entire data when the
> > error occurred then issue the following commands. Instead, I have
> > modified the patch to perform 'drop_replication_slot' in the beginning
> > if the relstate is datasync.  Do let me know if you can think of a
> > better way to fix this?
>
> I have verified that the problem is not seen after this patch. I also
> agree with the approach taken for the fix,
>

Thanks. I have fixed one of the issues reported by me earlier [1]
wherein the tablesync worker can repeatedly fail if after dropping the
slot there is an error while updating the SYNCDONE state in the
database. I have moved the drop of the slot just before commit of the
transaction where we are marking the state as SYNCDONE. Additionally,
I have removed unnecessary includes in tablesync.c, updated the docs
for Alter Subscription, and updated the comments at various places in
the patch. I have also updated the commit message this time.
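
To spell out that ordering, here is a minimal standalone sketch (stubbed
calls only, not the actual tablesync code): the catalog update stays
rollback-able until commit, while dropping the remote slot is irreversible,
so the drop is deferred to just before the commit to keep the failure window
small.

#include <stdio.h>

static void begin_tx(void)          { printf("BEGIN\n"); }
static void mark_syncdone(void)     { printf("UPDATE pg_subscription_rel SET srsubstate = 'syncdone'\n"); }
static void drop_remote_slot(void)  { printf("DROP_REPLICATION_SLOT pg_<suboid>_sync_<relid>\n"); }
static void commit_tx(void)         { printf("COMMIT\n"); }

int
main(void)
{
    begin_tx();
    mark_syncdone();        /* reversible until commit */
    drop_remote_slot();     /* irreversible, so done last, just before commit */
    commit_tx();
    return 0;
}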

I am still not very happy with the way we handle concurrent drop
origins but probably that would be addressed by the other patch Peter
is working on [2].

[1] - 
https://www.postgresql.org/message-id/CAA4eK1JdWv84nMyCpTboBURjj70y3BfO1xdy8SYPRqNxtH7TEA%40mail.gmail.com
[2] - 
https://www.postgresql.org/message-id/CAHut%2BPsW6%2B7Ucb1sxjSNBaSYPGAVzQFbAEg4y1KsYQiGCnyGeQ%40mail.gmail.com

-- 
With Regards,
Amit Kapila.


v27-0001-Allow-multiple-xacts-during-table-sync-in-logica.patch
Description: Binary data


Re: Single transaction in the tablesync worker?

2021-02-03 Thread Ajin Cherian
On Wed, Feb 3, 2021 at 11:38 PM Amit Kapila  wrote:

> Thanks for the report. The problem here was that the error occurred
> when we were trying to copy the large data. Now, before fetching the
> entire data we issued a rollback that led to this problem. I think the
> alternative here could be to first fetch the entire data when the
> error occurred then issue the following commands. Instead, I have
> modified the patch to perform 'drop_replication_slot' in the beginning
> if the relstate is datasync.  Do let me know if you can think of a
> better way to fix this?

I have verified that the problem is not seen after this patch. I also
agree with the approach taken for the fix,

regards,
Ajin Cherian
Fujitsu Australia




Re: Single transaction in the tablesync worker?

2021-02-03 Thread Amit Kapila
On Wed, Feb 3, 2021 at 1:28 PM Ajin Cherian  wrote:
>
> On Tue, Feb 2, 2021 at 9:03 PM Ajin Cherian  wrote:
>
> > I am sorry, my above steps were not correct. I think the reason for
> > the failure I was seeing were some other steps I did prior to this. I
> > will recreate this and update you with the appropriate steps.
>
> The correct steps are as follows:
>
> Publisher:
>
> postgres=# CREATE TABLE tab_rep (a int primary key);
> CREATE TABLE
> postgres=# INSERT INTO tab_rep SELECT generate_series(1,100);
> INSERT 0 100
> postgres=# CREATE PUBLICATION tap_pub FOR ALL TABLES;
> CREATE PUBLICATION
>
> Subscriber:
> postgres=# CREATE TABLE tab_rep (a int primary key);
> CREATE TABLE
> postgres=# CREATE SUBSCRIPTION tap_sub CONNECTION 'host=localhost
> dbname=postgres port=6972' PUBLICATION tap_pub WITH (enabled = false);
> NOTICE:  created replication slot "tap_sub" on publisher
> CREATE SUBSCRIPTION
> postgres=# ALTER SUBSCRIPTION tap_sub enable;
> ALTER SUBSCRIPTION
>
> Allow the tablesync to complete and then drop the subscription, the
> table remains full and restarting the subscription should fail with a
> constraint violation during tablesync but it does not.
>
>
> Subscriber:
> postgres=# drop subscription tap_sub ;
> NOTICE:  dropped replication slot "tap_sub" on publisher
> DROP SUBSCRIPTION
> postgres=# CREATE SUBSCRIPTION tap_sub CONNECTION 'host=localhost
> dbname=postgres port=6972' PUBLICATION tap_pub WITH (enabled = false);
> NOTICE:  created replication slot "tap_sub" on publisher
> CREATE SUBSCRIPTION
> postgres=# ALTER SUBSCRIPTION tap_sub enable;
> ALTER SUBSCRIPTION
>
> This takes the subscriber into an error loop but no mention of what
> the error was:
>

Thanks for the report. The problem here was that the error occurred
while we were trying to copy the large data set. Now, before fetching the
entire data we issued a rollback, and that led to this problem. The
alternative here could be to first fetch the entire data after the
error occurs and then issue the following commands. Instead, I have
modified the patch to perform 'drop_replication_slot' at the beginning
if the relstate is datasync.  Do let me know if you can think of a
better way to fix this?
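
Roughly, the behaviour that gives is the following (a standalone sketch with
simulated state, not the actual LogicalRepSyncTableStart code): a restart
that finds the relation still in DATASYNC drops the leftover slot before
creating a new one, avoiding the "already exists" error.

#include <stdbool.h>
#include <stdio.h>

typedef enum { STATE_INIT, STATE_DATASYNC, STATE_SYNCDONE } RelState;

static bool slot_exists = false;    /* simulated publisher-side tablesync slot */

static void
sync_table_start(RelState *state)
{
    if (*state == STATE_DATASYNC && slot_exists)
    {
        printf("dropping leftover tablesync slot from the failed attempt\n");
        slot_exists = false;
    }
    *state = STATE_DATASYNC;
    slot_exists = true;             /* CREATE_REPLICATION_SLOT succeeds */
    printf("copying table ...\n");
}

int
main(void)
{
    RelState state = STATE_INIT;

    sync_table_start(&state);       /* first attempt */
    printf("copy fails (e.g. PK violation); worker exits and restarts\n");

    sync_table_start(&state);       /* retry: no 'already exists' failure now */
    state = STATE_SYNCDONE;
    printf("table sync finished\n");
    return 0;
}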

-- 
With Regards,
Amit Kapila.


v26-0001-Tablesync-Solution1.patch
Description: Binary data


Re: Single transaction in the tablesync worker?

2021-02-03 Thread Peter Smith
On Wed, Feb 3, 2021 at 2:51 PM Peter Smith  wrote:
>
> On Wed, Feb 3, 2021 at 1:34 PM Amit Kapila  wrote:
> >
> > On Wed, Feb 3, 2021 at 6:38 AM Peter Smith  wrote:
> > >
> > > On Wed, Feb 3, 2021 at 12:26 AM Amit Kapila  
> > > wrote:
> > > >
> > > > On Tue, Feb 2, 2021 at 3:31 PM Peter Smith  
> > > > wrote:
> > > > >
> > > > > After seeing Ajin's test [ac0202] which did a DROP TABLE, I have also
> > > > > tried a simple test where I do a DROP TABLE with very bad timing for
> > > > > the tablesync worker. It seems that doing this can cause the sync
> > > > > worker's MyLogicalRepWorker->relid to become invalid.
> > > > >
> > > >
> > > > I think this should be fixed by latest patch because I have disallowed
> > > > drop of a table when its synchronization is in progress. You can check
> > > > once and let me know if the issue still exists?
> > > >
> > >
> > > FYI - I confirmed that the problem scenario that I reported yesterday
> > > is no longer possible because now the V25 patch is disallowing the
> > > DROP TABLE while the tablesync is still running.
> > >
> >
> > Thanks for the confirmation. BTW, can you please check if we can
> > reproduce that problem without this patch? If so, we might want to
> > apply this fix irrespective of this patch. If not, why not?
> >
>
> Yes, this was an existing postgres bug. It is independent of the patch.
>
> I can reproduce exactly the same stacktrace using the HEAD src pulled @ 3/Feb.
>
> PSA my test logs showing the details.
>

Since this is an existing PG bug independent of this patch, I spawned
a new thread [ps0202] to deal with this problem.


[ps0202] 
https://www.postgresql.org/message-id/CAHut%2BPu7Z4a%3Domo%2BTvK5Gub2hxcJ7-3%2BBu1FO_%2B%2BfpFTW0oQfQ%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Single transaction in the tablesync worker?

2021-02-02 Thread Ajin Cherian
On Tue, Feb 2, 2021 at 9:03 PM Ajin Cherian  wrote:

> I am sorry, my above steps were not correct. I think the reason for
> the failure I was seeing were some other steps I did prior to this. I
> will recreate this and update you with the appropriate steps.

The correct steps are as follows:

Publisher:

postgres=# CREATE TABLE tab_rep (a int primary key);
CREATE TABLE
postgres=# INSERT INTO tab_rep SELECT generate_series(1,100);
INSERT 0 100
postgres=# CREATE PUBLICATION tap_pub FOR ALL TABLES;
CREATE PUBLICATION

Subscriber:
postgres=# CREATE TABLE tab_rep (a int primary key);
CREATE TABLE
postgres=# CREATE SUBSCRIPTION tap_sub CONNECTION 'host=localhost
dbname=postgres port=6972' PUBLICATION tap_pub WITH (enabled = false);
NOTICE:  created replication slot "tap_sub" on publisher
CREATE SUBSCRIPTION
postgres=# ALTER SUBSCRIPTION tap_sub enable;
ALTER SUBSCRIPTION

Allow the tablesync to complete and then drop the subscription; the
table remains full, and restarting the subscription should fail with a
constraint violation during tablesync, but it does not.


Subscriber:
postgres=# drop subscription tap_sub ;
NOTICE:  dropped replication slot "tap_sub" on publisher
DROP SUBSCRIPTION
postgres=# CREATE SUBSCRIPTION tap_sub CONNECTION 'host=localhost
dbname=postgres port=6972' PUBLICATION tap_pub WITH (enabled = false);
NOTICE:  created replication slot "tap_sub" on publisher
CREATE SUBSCRIPTION
postgres=# ALTER SUBSCRIPTION tap_sub enable;
ALTER SUBSCRIPTION

This takes the subscriber into an error loop, but with no mention of what
the underlying error was:

2021-02-02 05:01:34.698 EST [1549] LOG:  logical replication table
synchronization worker for subscription "tap_sub", table "tab_rep" has
started
2021-02-02 05:01:34.739 EST [1549] ERROR:  table copy could not
rollback transaction on publisher
2021-02-02 05:01:34.739 EST [1549] DETAIL:  The error was: another
command is already in progress
2021-02-02 05:01:34.740 EST [8028] LOG:  background worker "logical
replication worker" (PID 1549) exited with exit code 1
2021-02-02 05:01:40.107 EST [1711] LOG:  logical replication table
synchronization worker for subscription "tap_sub", table "tab_rep" has
started
2021-02-02 05:01:40.121 EST [1711] ERROR:  could not create
replication slot "pg_16479_sync_16435": ERROR:  replication slot
"pg_16479_sync_16435" already exists
2021-02-02 05:01:40.121 EST [8028] LOG:  background worker "logical
replication worker" (PID 1711) exited with exit code 1
2021-02-02 05:01:45.140 EST [1891] LOG:  logical replication table
synchronization worker for subscription "tap_sub", table "tab_rep" has
started

regards,
Ajin Cherian
Fujitsu Australia




Re: Single transaction in the tablesync worker?

2021-02-02 Thread Peter Smith
On Wed, Feb 3, 2021 at 1:34 PM Amit Kapila  wrote:
>
> On Wed, Feb 3, 2021 at 6:38 AM Peter Smith  wrote:
> >
> > On Wed, Feb 3, 2021 at 12:26 AM Amit Kapila  wrote:
> > >
> > > On Tue, Feb 2, 2021 at 3:31 PM Peter Smith  wrote:
> > > >
> > > > After seeing Ajin's test [ac0202] which did a DROP TABLE, I have also
> > > > tried a simple test where I do a DROP TABLE with very bad timing for
> > > > the tablesync worker. It seems that doing this can cause the sync
> > > > worker's MyLogicalRepWorker->relid to become invalid.
> > > >
> > >
> > > I think this should be fixed by latest patch because I have disallowed
> > > drop of a table when its synchronization is in progress. You can check
> > > once and let me know if the issue still exists?
> > >
> >
> > FYI - I confirmed that the problem scenario that I reported yesterday
> > is no longer possible because now the V25 patch is disallowing the
> > DROP TABLE while the tablesync is still running.
> >
>
> Thanks for the confirmation. BTW, can you please check if we can
> reproduce that problem without this patch? If so, we might want to
> apply this fix irrespective of this patch. If not, why not?
>

Yes, this was an existing postgres bug. It is independent of the patch.

I can reproduce exactly the same stacktrace using the HEAD src pulled @ 3/Feb.

PSA my test logs showing the details.


Kind Regards,
Peter Smith.
Fujitsu Australia
Test Scenario:

1. INSERT data so tablesync should copy something
2. While paused in LogicalRepSyncTableStart do a DROP TABLE to rip the rug out 
from under the sync worker!
3. Continue the sync worker see what happens

All code is from PG HEAD (3/Feb) except the "!!>>" added to allow me to attach 
to the debugger.

==

##
## Insert data
##

[postgres@CentOS7-x64 ~]$ psql -d test_pub -c "INSERT INTO test_tab VALUES(1, 
'foo');"
INSERT 0 1

##
## SUBSCRIBE and continue to breakpoint at top of tablesync function 
LogicalRepSyncTableStart
##

[postgres@CentOS7-x64 ~]$ psql -d test_sub -p 54321 -c "CREATE SUBSCRIPTION 
tap_sub CONNECTION 'host=localhost dbname=test_pub application_name=tap_sub' 
PUBLICATION tap_pub;"
2021-02-03 13:42:20.031 AEDT [9555] LOG:  logical decoding found consistent 
point at 0/16605E8
2021-02-03 13:42:20.031 AEDT [9555] DETAIL:  There are no running transactions.
2021-02-03 13:42:20.031 AEDT [9555] STATEMENT:  CREATE_REPLICATION_SLOT 
"tap_sub" LOGICAL pgoutput NOEXPORT_SNAPSHOT
NOTICE:  created replication slot "tap_sub" on publisher
CREATE SUBSCRIPTION
2021-02-03 13:42:20.043 AEDT [9556] LOG:  logical replication apply worker for 
subscription "tap_sub" has started
2021-02-03 13:42:20.043 AEDT [9556] LOG:  !!>> The apply worker process has PID 
= 9556
[postgres@CentOS7-x64 ~]$ 2021-02-03 13:42:20.052 AEDT [9562] LOG:  starting 
logical decoding for slot "tap_sub"
2021-02-03 13:42:20.052 AEDT [9562] DETAIL:  Streaming transactions committing 
after 0/1660620, reading WAL from 0/16605E8.
2021-02-03 13:42:20.052 AEDT [9562] STATEMENT:  START_REPLICATION SLOT 
"tap_sub" LOGICAL 0/0 (proto_version '2', publication_names '"tap_pub"')
2021-02-03 13:42:20.052 AEDT [9562] LOG:  logical decoding found consistent 
point at 0/16605E8
2021-02-03 13:42:20.052 AEDT [9562] DETAIL:  There are no running transactions.
2021-02-03 13:42:20.052 AEDT [9562] STATEMENT:  START_REPLICATION SLOT 
"tap_sub" LOGICAL 0/0 (proto_version '2', publication_names '"tap_pub"')
2021-02-03 13:42:20.056 AEDT [9564] LOG:  logical replication table 
synchronization worker for subscription "tap_sub", table "test_tab" has started
2021-02-03 13:42:20.056 AEDT [9564] LOG:  !!>> The tablesync worker process has 
PID = 9564
2021-02-03 13:42:20.056 AEDT [9564] LOG:  !!>>


Sleeping 30 secs. For debugging, attach to process 9564 now!

## PID 9564 is sync worker
## PID 9556 is apply worker

##
## While paused in debugger for the tablesync worker do DROP TABLE on 
subscriber.
## Note we have not done any ALTER SUBSCRIPTION.
##

psql -d test_sub -p 54321 -c "DROP TABLE test_tab;"
DROP TABLE

## Following stacktrace LOG happens

[postgres@CentOS7-x64 ~]$ TRAP: FailedAssertion("strvalue != NULL", File: 
"snprintf.c", Line: 442, PID: 9564)
postgres: logical replication worker for subscription 16407 sync 16385 
(ExceptionalCondition+0xb9)[0xad8f4b]
postgres: logical replication worker for subscription 16407 sync 16385 
[0xb4c6ee]
postgres: logical replication worker for subscription 16407 sync 16385 
(pg_vsnprintf+0x7c)[0xb4c072]
postgres: logical replication worker for subscription 16407 sync 16385 
(pvsnprintf+0x30)[0xb3f26f]
postgres: logical replication worker for subscription 16407 sync 16385 
(appendStringInfoVA+0x80)[0xb40cb6]
postgres: logical replication worker for subscription 16407 sync 16385 
(errmsg+0x183)[0xad9d33]
postgres: logical replication worker for subscription 16407 sync 16385 
[0x8c52f3]
postgres: logical replication worker for subscription 16407 sync 16385 

Re: Single transaction in the tablesync worker?

2021-02-02 Thread Amit Kapila
On Wed, Feb 3, 2021 at 6:38 AM Peter Smith  wrote:
>
> On Wed, Feb 3, 2021 at 12:26 AM Amit Kapila  wrote:
> >
> > On Tue, Feb 2, 2021 at 3:31 PM Peter Smith  wrote:
> > >
> > > After seeing Ajin's test [ac0202] which did a DROP TABLE, I have also
> > > tried a simple test where I do a DROP TABLE with very bad timing for
> > > the tablesync worker. It seems that doing this can cause the sync
> > > worker's MyLogicalRepWorker->relid to become invalid.
> > >
> >
> > I think this should be fixed by latest patch because I have disallowed
> > drop of a table when its synchronization is in progress. You can check
> > once and let me know if the issue still exists?
> >
>
> FYI - I confirmed that the problem scenario that I reported yesterday
> is no longer possible because now the V25 patch is disallowing the
> DROP TABLE while the tablesync is still running.
>

Thanks for the confirmation. BTW, can you please check if we can
reproduce that problem without this patch? If so, we might want to
apply this fix irrespective of this patch. If not, why not?

-- 
With Regards,
Amit Kapila.




Re: Single transaction in the tablesync worker?

2021-02-02 Thread Ajin Cherian
On Wed, Feb 3, 2021 at 12:24 AM Amit Kapila  wrote:

> The problem here is that we are allowing to drop the table when table
> synchronization is still in progress and then we don't have any way to
> know the corresponding slot or origin. I think we can try to drop the
> slot and origin as well but that is not a good idea because slots once
> dropped won't be rolled back. So, I have added a fix to disallow the
> drop of the table when table synchronization is still in progress.
> Apart from that, I have fixed comments raised by Peter as discussed
> above and made some additional changes in comments, code (code changes
> are cosmetic), and docs.
>
> Let me know if the issue reported is fixed or not?

Yes, the issue is fixed, now the table drop results in an error.

postgres=# drop table tab_rep ;
ERROR:  could not drop relation mapping for subscription "tap_sub"
DETAIL:  Table synchronization for relation "tab_rep" is in progress
and is in state "f".
HINT:  Use ALTER SUBSCRIPTION ... ENABLE to enable subscription if not
already enabled or use DROP SUBSCRIPTION ... to drop the subscription.

regards,
Ajin Cherian
Fujitsu Australia




Re: Single transaction in the tablesync worker?

2021-02-02 Thread Peter Smith
On Wed, Feb 3, 2021 at 12:26 AM Amit Kapila  wrote:
>
> On Tue, Feb 2, 2021 at 3:31 PM Peter Smith  wrote:
> >
> > After seeing Ajin's test [ac0202] which did a DROP TABLE, I have also
> > tried a simple test where I do a DROP TABLE with very bad timing for
> > the tablesync worker. It seems that doing this can cause the sync
> > worker's MyLogicalRepWorker->relid to become invalid.
> >
>
> I think this should be fixed by latest patch because I have disallowed
> drop of a table when its synchronization is in progress. You can check
> once and let me know if the issue still exists?
>

FYI - I confirmed that the problem scenario that I reported yesterday
is no longer possible because now the V25 patch is disallowing the
DROP TABLE while the tablesync is still running.

PSA my test logs showing it is now working as expected.


Kind Regards,
Peter Smith.
Fujitsu Australia
Test Scenario

1. INSERT data so tablesync should copy something
2. While paused in LogicalRepSyncTableStart do a DROP TABLE to rip the rug out 
from under the sync worker!
3. Continue the sync worker see what happens

==

##
## Insert data
##

[postgres@CentOS7-x64 ~]$ psql -d test_pub -c "INSERT INTO test_tab VALUES(1, 
'foo');"
INSERT 0 1


##
## SUBSCRIBE and continue to breakpoint at top of tablesync function 
LogicalRepSyncTableStart
##

[postgres@CentOS7-x64 ~]$ psql -d test_sub -p 54321 -c "CREATE SUBSCRIPTION 
tap_sub CONNECTION 'host=localhost dbname=test_pub application_name=tap_sub' 
PUBLICATION tap_pub;"
2021-02-03 11:46:07.207 AEDT [8298] LOG:  logical decoding found consistent 
point at 0/1654698
2021-02-03 11:46:07.207 AEDT [8298] DETAIL:  There are no running transactions.
2021-02-03 11:46:07.207 AEDT [8298] STATEMENT:  CREATE_REPLICATION_SLOT 
"tap_sub" LOGICAL pgoutput NOEXPORT_SNAPSHOT
NOTICE:  created replication slot "tap_sub" on publisher
CREATE SUBSCRIPTION
2021-02-03 11:46:07.218 AEDT [8302] LOG:  logical replication apply worker for 
subscription "tap_sub" has started
2021-02-03 11:46:07.218 AEDT [8302] LOG:  !!>> The apply worker process has PID 
= 8302
[postgres@CentOS7-x64 ~]$ 2021-02-03 11:46:07.224 AEDT [8306] LOG:  starting 
logical decoding for slot "tap_sub"
2021-02-03 11:46:07.224 AEDT [8306] DETAIL:  Streaming transactions committing 
after 0/16546D0, reading WAL from 0/1654698.
2021-02-03 11:46:07.224 AEDT [8306] STATEMENT:  START_REPLICATION SLOT 
"tap_sub" LOGICAL 0/0 (proto_version '2', publication_names '"tap_pub"')
2021-02-03 11:46:07.225 AEDT [8306] LOG:  logical decoding found consistent 
point at 0/1654698
2021-02-03 11:46:07.225 AEDT [8306] DETAIL:  There are no running transactions.
2021-02-03 11:46:07.225 AEDT [8306] STATEMENT:  START_REPLICATION SLOT 
"tap_sub" LOGICAL 0/0 (proto_version '2', publication_names '"tap_pub"')
2021-02-03 11:46:07.225 AEDT [8302] LOG:  !!>> apply worker: LogicalRepApplyLoop
2021-02-03 11:46:07.225 AEDT [8302] LOG:  !!>> apply worker: called 
process_syncing_tables
2021-02-03 11:46:07.229 AEDT [8307] LOG:  logical replication table 
synchronization worker for subscription "tap_sub", table "test_tab" has started
2021-02-03 11:46:07.229 AEDT [8307] LOG:  !!>> The tablesync worker process has 
PID = 8307
2021-02-03 11:46:07.229 AEDT [8307] LOG:  !!>>


Sleeping 30 secs. For debugging, attach to process 8307 now!

## PID 8302 is apply worker
## PID 8307 is tablesync worker

2021-02-03 11:46:08.241 AEDT [8302] LOG:  !!>> apply worker: LogicalRepApplyLoop
2021-02-03 11:46:08.241 AEDT [8302] LOG:  !!>> apply worker: called 
process_syncing_tables
2021-02-03 11:46:09.253 AEDT [8302] LOG:  !!>> apply worker: LogicalRepApplyLoop
2021-02-03 11:46:09.253 AEDT [8302] LOG:  !!>> apply worker: called 
process_syncing_tables
2021-02-03 11:46:10.255 AEDT [8302] LOG:  !!>> apply worker: LogicalRepApplyLoop
2021-02-03 11:46:10.255 AEDT [8302] LOG:  !!>> apply worker: called 
process_syncing_tables
2021-02-03 11:46:11.258 AEDT [8302] LOG:  !!>> apply worker: LogicalRepApplyLoop
2021-02-03 11:46:11.258 AEDT [8302] LOG:  !!>> apply worker: called 
process_syncing_tables
2021-02-03 11:46:12.263 AEDT [8302] LOG:  !!>> apply worker: LogicalRepApplyLoop
2021-02-03 11:46:12.263 AEDT [8302] LOG:  !!>> apply worker: called 
process_syncing_tables
2021-02-03 11:46:13.265 AEDT [8302] LOG:  !!>> apply worker: LogicalRepApplyLoop
2021-02-03 11:46:13.265 AEDT [8302] LOG:  !!>> apply worker: called 
process_syncing_tables
2021-02-03 11:46:14.275 AEDT [8302] LOG:  !!>> apply worker: LogicalRepApplyLoop
2021-02-03 11:46:14.275 AEDT [8302] LOG:  !!>> apply worker: called 
process_syncing_tables
2021-02-03 11:46:15.280 AEDT [8302] LOG:  !!>> apply worker: LogicalRepApplyLoop
2021-02-03 11:46:15.280 AEDT [8302] LOG:  !!>> apply worker: called 
process_syncing_tables
2021-02-03 11:46:16.282 AEDT [8302] LOG:  !!>> apply worker: LogicalRepApplyLoop
2021-02-03 11:46:16.282 AEDT [8302] LOG:  !!>> apply worker: called 
process_syncing_tables
2021-02-03 

Re: Single transaction in the tablesync worker?

2021-02-02 Thread Amit Kapila
On Tue, Feb 2, 2021 at 3:31 PM Peter Smith  wrote:
>
> After seeing Ajin's test [ac0202] which did a DROP TABLE, I have also
> tried a simple test where I do a DROP TABLE with very bad timing for
> the tablesync worker. It seems that doing this can cause the sync
> worker's MyLogicalRepWorker->relid to become invalid.
>

I think this should be fixed by latest patch because I have disallowed
drop of a table when its synchronization is in progress. You can check
once and let me know if the issue still exists?

-- 
With Regards,
Amit Kapila.




Re: Single transaction in the tablesync worker?

2021-02-02 Thread Amit Kapila
On Tue, Feb 2, 2021 at 11:35 AM Ajin Cherian  wrote:
>
> Another failure I see in my testing
>

The problem here is that we are allowing the table to be dropped while its
synchronization is still in progress, and then we don't have any way to
know the corresponding slot or origin. I think we could try to drop the
slot and origin as well, but that is not a good idea because slots, once
dropped, won't be rolled back. So, I have added a fix to disallow the
drop of the table while its synchronization is still in progress.
Apart from that, I have fixed the comments raised by Peter as discussed
above and made some additional changes in comments, code (the code changes
are cosmetic), and docs.
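
The rule boils down to roughly the following (a simplified standalone
stand-in, not the actual RemoveSubscriptionRel check; the state letters are
only illustrative): the relation mapping may be dropped once the relation has
reached SYNCDONE or READY, and is rejected in any earlier state.

#include <stdbool.h>
#include <stdio.h>

#define SUBREL_STATE_SYNCDONE 's'
#define SUBREL_STATE_READY    'r'

/* Refuse the drop while the tablesync worker may still own a slot/origin. */
static bool
can_drop_relation_mapping(char srsubstate)
{
    return srsubstate == SUBREL_STATE_SYNCDONE ||
           srsubstate == SUBREL_STATE_READY;
}

int
main(void)
{
    printf("state 'r': %s\n",
           can_drop_relation_mapping('r') ? "drop allowed" : "ERROR: sync in progress");
    printf("state 'f': %s\n",
           can_drop_relation_mapping('f') ? "drop allowed" : "ERROR: sync in progress");
    return 0;
}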

Let me know if the issue reported is fixed or not?

-- 
With Regards,
Amit Kapila.


v25-0001-Tablesync-Solution1.patch
Description: Binary data


Re: Single transaction in the tablesync worker?

2021-02-02 Thread Ajin Cherian
On Tue, Feb 2, 2021 at 7:40 PM Amit Kapila  wrote:
>
> On Tue, Feb 2, 2021 at 10:34 AM Ajin Cherian  wrote:
> >
> > On Mon, Feb 1, 2021 at 11:26 PM Amit Kapila  wrote:
> > > I have updated the patch to display WARNING for each of the tablesync
> > > slots during DropSubscription. As discussed, I have moved the drop
> > > slot related code towards the end in AlterSubscription_refresh. Apart
> > > from this, I have fixed one more issue in tablesync code where in
> > > after catching the exception we were not clearing the transaction
> > > state on the publisher, see changes in LogicalRepSyncTableStart. I
> > > have also fixed other comments raised by you. Additionally, I have
> > > removed the test because it was creating the same name slot as the
> > > tablesync worker and tablesync worker removed the same due to new
> > > logic in LogicalRepSyncStart. Earlier, it was not failing because of
> > > the bug in that code which I have fixed in the attached.
> > >
> >
> > I was testing this patch. I had a table on the subscriber which had a
> > row that would cause a PK constraint
> > violation during the table copy. This is resulting in the subscriber
> > trying to rollback the table copy and failing.
> >
>
> I am not getting this error. I have tried by below test:

I am sorry, my above steps were not correct. I think the reason for
the failure I was seeing was some other steps I did prior to this. I
will recreate this and update you with the appropriate steps.

regards,
Ajin Cherian
Fujitsu Australia




Re: Single transaction in the tablesync worker?

2021-02-02 Thread Peter Smith
After seeing Ajin's test [ac0202] which did a DROP TABLE, I have also
tried a simple test where I do a DROP TABLE with very bad timing for
the tablesync worker. It seems that doing this can cause the sync
worker's MyLogicalRepWorker->relid to become invalid.

In my test this caused a stack trace within some logging, but I
imagine other bad things can happen if the tablesync worker can be
executed with an invalid relid.

Possibly this is an existing PG bug which has just never been seen
before; the ereport which has failed here is not new code.

PSA the log for the test steps and the stack trace details.


[ac0202] 
https://www.postgresql.org/message-id/CAFPTHDYzjaNfzsFHpER9idAPB8v5j%3DSUbWL0AKj5iVy0BKbTpg%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia
Test Scenario

1. INSERT data so tablesync should copy something
2. While paused in LogicalRepSyncTableStart do a DROP TABLE to rip the rug out 
from under the sync worker!
3. Continue the sync worker see what happens

==

##
## Insert data
##

[postgres@CentOS7-x64 ~]$ psql -d test_pub -c "INSERT INTO test_tab VALUES(1, 
'foo');"
INSERT 0 1

##
## SUBSCRIBE and continue to breakpoint at top of tablesync function 
LogicalRepSyncTableStart
##

[postgres@CentOS7-x64 ~]$ psql -d test_sub -p 54321 -c "CREATE SUBSCRIPTION 
tap_sub CONNECTION 'host=localhost dbname=test_pub application_name=tap_sub' 
PUBLICATION tap_pub;"
2021-02-02 19:29:16.578 AEDT [26402] LOG:  logical decoding found consistent 
point at 0/165F800
2021-02-02 19:29:16.578 AEDT [26402] DETAIL:  There are no running transactions.
2021-02-02 19:29:16.578 AEDT [26402] STATEMENT:  CREATE_REPLICATION_SLOT 
"tap_sub" LOGICAL pgoutput NOEXPORT_SNAPSHOT
NOTICE:  created replication slot "tap_sub" on publisher
CREATE SUBSCRIPTION
2021-02-02 19:29:16.587 AEDT [26405] LOG:  logical replication apply worker for 
subscription "tap_sub" has started
2021-02-02 19:29:16.587 AEDT [26405] LOG:  !!>> The apply worker process has 
PID = 26405
2021-02-02 19:29:16.597 AEDT [26411] LOG:  starting logical decoding for slot 
"tap_sub"
2021-02-02 19:29:16.597 AEDT [26411] DETAIL:  Streaming transactions committing 
after 0/165F838, reading WAL from 0/165F800.
2021-02-02 19:29:16.597 AEDT [26411] STATEMENT:  START_REPLICATION SLOT 
"tap_sub" LOGICAL 0/0 (proto_version '2', publication_names '"tap_pub"')
2021-02-02 19:29:16.598 AEDT [26411] LOG:  logical decoding found consistent 
point at 0/165F800
2021-02-02 19:29:16.598 AEDT [26411] DETAIL:  There are no running transactions.
2021-02-02 19:29:16.598 AEDT [26411] STATEMENT:  START_REPLICATION SLOT 
"tap_sub" LOGICAL 0/0 (proto_version '2', publication_names '"tap_pub"')
2021-02-02 19:29:16.598 AEDT [26405] LOG:  !!>> apply worker: 
LogicalRepApplyLoop
2021-02-02 19:29:16.598 AEDT [26405] LOG:  !!>> apply worker: called 
process_syncing_tables
[postgres@CentOS7-x64 ~]$ 2021-02-02 19:29:16.602 AEDT [26415] LOG:  logical 
replication table synchronization worker for subscription "tap_sub", table 
"test_tab" has started
2021-02-02 19:29:16.602 AEDT [26415] LOG:  !!>> The tablesync worker process 
has PID = 26415
2021-02-02 19:29:16.602 AEDT [26415] LOG:  !!>>


Sleeping 30 secs. For debugging, attach to process 26415 now!



2021-02-02 19:29:17.610 AEDT [26405] LOG:  !!>> apply worker: 
LogicalRepApplyLoop
2021-02-02 19:29:17.610 AEDT [26405] LOG:  !!>> apply worker: called 
process_syncing_tables
2021-02-02 19:29:18.611 AEDT [26405] LOG:  !!>> apply worker: 
LogicalRepApplyLoop
2021-02-02 19:29:18.611 AEDT [26405] LOG:  !!>> apply worker: called 
process_syncing_tables
2021-02-02 19:29:19.612 AEDT [26405] LOG:  !!>> apply worker: 
LogicalRepApplyLoop
2021-02-02 19:29:19.612 AEDT [26405] LOG:  !!>> apply worker: called 
process_syncing_tables
2021-02-02 19:29:20.613 AEDT [26405] LOG:  !!>> apply worker: 
LogicalRepApplyLoop
2021-02-02 19:29:20.613 AEDT [26405] LOG:  !!>> apply worker: called 
process_syncing_tables
2021-02-02 19:29:21.614 AEDT [26405] LOG:  !!>> apply worker: 
LogicalRepApplyLoop
2021-02-02 19:29:21.614 AEDT [26405] LOG:  !!>> apply worker: called 
process_syncing_tables
2021-02-02 19:29:22.615 AEDT [26405] LOG:  !!>> apply worker: 
LogicalRepApplyLoop
2021-02-02 19:29:22.615 AEDT [26405] LOG:  !!>> apply worker: called 
process_syncing_tables
2021-02-02 19:29:23.615 AEDT [26405] LOG:  !!>> apply worker: 
LogicalRepApplyLoop
2021-02-02 19:29:23.615 AEDT [26405] LOG:  !!>> apply worker: called 
process_syncing_tables
2021-02-02 19:29:24.658 AEDT [26405] LOG:  !!>> apply worker: 
LogicalRepApplyLoop
2021-02-02 19:29:24.658 AEDT [26405] LOG:  !!>> apply worker: called 
process_syncing_tables
2021-02-02 19:29:25.661 AEDT [26405] LOG:  !!>> apply worker: 
LogicalRepApplyLoop
2021-02-02 19:29:25.661 AEDT [26405] LOG:  !!>> apply worker: called 
process_syncing_tables
2021-02-02 19:29:26.662 AEDT [26405] LOG:  !!>> apply worker: 
LogicalRepApplyLoop
2021-02-02 19:29:26.662 AEDT 

Re: Single transaction in the tablesync worker?

2021-02-02 Thread Amit Kapila
On Tue, Feb 2, 2021 at 10:34 AM Ajin Cherian  wrote:
>
> On Mon, Feb 1, 2021 at 11:26 PM Amit Kapila  wrote:
> > I have updated the patch to display WARNING for each of the tablesync
> > slots during DropSubscription. As discussed, I have moved the drop
> > slot related code towards the end in AlterSubscription_refresh. Apart
> > from this, I have fixed one more issue in tablesync code where in
> > after catching the exception we were not clearing the transaction
> > state on the publisher, see changes in LogicalRepSyncTableStart. I
> > have also fixed other comments raised by you. Additionally, I have
> > removed the test because it was creating the same name slot as the
> > tablesync worker and tablesync worker removed the same due to new
> > logic in LogicalRepSyncStart. Earlier, it was not failing because of
> > the bug in that code which I have fixed in the attached.
> >
>
> I was testing this patch. I had a table on the subscriber which had a
> row that would cause a PK constraint
> violation during the table copy. This is resulting in the subscriber
> trying to rollback the table copy and failing.
>

I am not getting this error. I have tried the below test:
Publisher
===
CREATE TABLE mytbl1(id SERIAL PRIMARY KEY, somedata int, text varchar(120));

BEGIN;
INSERT INTO mytbl1(somedata, text) VALUES (1, 1);
INSERT INTO mytbl1(somedata, text) VALUES (1, 2);
COMMIT;

CREATE PUBLICATION mypublication FOR TABLE mytbl1;

Subscriber
=
CREATE TABLE mytbl1(id SERIAL PRIMARY KEY, somedata int, text varchar(120));

BEGIN;
INSERT INTO mytbl1(somedata, text) VALUES (1, 1);
INSERT INTO mytbl1(somedata, text) VALUES (1, 2);
COMMIT;

CREATE SUBSCRIPTION mysub
 CONNECTION 'host=localhost port=5432 dbname=postgres'
PUBLICATION mypublication;

It generates the PK violation the first time and then I removed the
conflicting rows in the subscriber and it passed. See logs below.

2021-02-02 13:51:34.316 IST [20796] LOG:  logical replication table
synchronization worker for subscription "mysub", table "mytbl1" has
started
2021-02-02 13:52:43.625 IST [20796] ERROR:  duplicate key value
violates unique constraint "mytbl1_pkey"
2021-02-02 13:52:43.625 IST [20796] DETAIL:  Key (id)=(1) already exists.
2021-02-02 13:52:43.625 IST [20796] CONTEXT:  COPY mytbl1, line 1
2021-02-02 13:52:43.695 IST [27840] LOG:  background worker "logical
replication worker" (PID 20796) exited with exit code 1
2021-02-02 13:52:43.884 IST [6260] LOG:  logical replication table
synchronization worker for subscription "mysub", table "mytbl1" has
started
2021-02-02 13:53:54.680 IST [6260] LOG:  logical replication table
synchronization worker for subscription "mysub", table "mytbl1" has
finished

Also, a similar test exists in subscription/t/004_sync.pl; is that also failing for
you? Can you please provide the detailed steps that led to this failure?

>
> And one more thing I see is that now we error out in PG_CATCH() in
> LogicalRepSyncTableStart() with the above error and as a result, the
> tablesync slot is not dropped. Hence causing the slot create to fail
> in the next restart.
> I think this can be avoided. We could either attempt a rollback only
> on specific failures, or drop the slot prior to erroring out.
>

Hmm, we have to first rollback before attempting any other operation
because the transaction on the publisher is in an errored state.
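
For illustration, this is the usual aborted-transaction rule, the same as in
any psql session:

BEGIN;
SELECT 1/0;     -- ERROR:  division by zero
SELECT 1;       -- ERROR:  current transaction is aborted, commands ignored
                --         until end of transaction block
ROLLBACK;       -- only this is accepted at that point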


-- 
With Regards,
Amit Kapila.




Re: Single transaction in the tablesync worker?

2021-02-01 Thread Peter Smith
On Mon, Feb 1, 2021 at 11:26 PM Amit Kapila  wrote:
>
> I have updated the patch to display WARNING for each of the tablesync
> slots during DropSubscription. As discussed, I have moved the drop
> slot related code towards the end in AlterSubscription_refresh. Apart
> from this, I have fixed one more issue in tablesync code where in
> after catching the exception we were not clearing the transaction
> state on the publisher, see changes in LogicalRepSyncTableStart.
...

I know that in another email [ac0202] Ajin has reported some problem
he found related to this new (LogicalRepSyncTableStart PG_CATCH) code
for some different use-case, but for my test scenario of a "broken
connection during a table copy" the code did appear to be working
properly.

PSA detailed logs which show the test steps and output for this
""broken connection during a table copy" scenario.


[ac0202] 
https://www.postgresql.org/message-id/CAFPTHDaZw5o%2BwMbv3aveOzuLyz_rqZebXAj59rDKTJbwXFPYgw%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia
Test Scenario 

=
OBJECTIVE
- Break the connection during table copy (after slot is created)
- Later re-establish the connection
- Check tablesync is able to recover and finish normally.
- Check no slots are left dangling.

=
STEPS
1.  Create PUBLISHER
2.  Insert some initial table data
3.  Create SUBSCRIBER
4.  Cause a network connection failure DURING table copy but AFTER the slot 
is created
4a. Pause in debugger AFTER slot created but before the copy
4b. Make the connection fail by stopping the PUBLISHER Postgres 
instance at this time
5.  Let the tablesync continue.
5a. The tablesync will re-launch but I think it will continue to 
fail repeatedly because it cannot create the slot while the connection is 
broken
6.  Then fix the connection by starting the PUBLISHER Postgres instance 
again
6a. The tablesync's attempt to create the slot will ERROR 
(already exists) at this time, but then the PG_CATCH will drop it and the 
tablesync will be launched yet again…
7.  The tablesync should recover
7a. This time when the tablesync is relaunched there is no “already 
exists” error so it should be able to recover OK.
8.  The tablesync worker should finish normally
9.  The replicated data should be visible in the subscribed table
10. There should be no dangling slots (e.g. check for warnings during DROP 
SUBSCRIPTION; see the query sketched just below this list)
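
For step 10, a quick way to check the publisher for leftover tablesync slots
is a query along these lines (a sketch; the pattern follows the generated
pg_%u_sync_%u names discussed in this thread):

SELECT slot_name FROM pg_replication_slots
WHERE slot_name ~ '^pg_[0-9]+_sync_[0-9]+$';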


=

##
## 2. Insert a record at publisher
##

[postgres@CentOS7-x64 ~]$ psql -d test_pub -c "INSERT INTO test_tab VALUES(1, 
'foo');"
INSERT 0 1


##
## 3. Start Subscription and pause the tablesync after its slot is created but 
before the copy starts
##

[postgres@CentOS7-x64 ~]$ psql -d test_sub -p 54321 -c "CREATE SUBSCRIPTION 
tap_sub CONNECTION 'host=localhost dbname=test_pub application_name=tap_sub' 
PUBLICATION tap_pub;"
2021-02-02 16:02:44.094 AEDT [24946] LOG:  logical decoding found consistent 
point at 0/1653EB0
2021-02-02 16:02:44.094 AEDT [24946] DETAIL:  There are no running transactions.
2021-02-02 16:02:44.094 AEDT [24946] STATEMENT:  CREATE_REPLICATION_SLOT 
"tap_sub" LOGICAL pgoutput NOEXPORT_SNAPSHOT
NOTICE:  created replication slot "tap_sub" on publisher
CREATE SUBSCRIPTION
2021-02-02 16:02:44.103 AEDT [24947] LOG:  logical replication apply worker for 
subscription "tap_sub" has started
2021-02-02 16:02:44.103 AEDT [24947] LOG:  !!>> The apply worker process has 
PID = 24947
[postgres@CentOS7-x64 ~]$ 2021-02-02 16:02:44.110 AEDT [24954] LOG:  starting 
logical decoding for slot "tap_sub"
2021-02-02 16:02:44.110 AEDT [24954] DETAIL:  Streaming transactions committing 
after 0/1653EE8, reading WAL from 0/1653EB0.
2021-02-02 16:02:44.110 AEDT [24954] STATEMENT:  START_REPLICATION SLOT 
"tap_sub" LOGICAL 0/0 (proto_version '2', publication_names '"tap_pub"')
2021-02-02 16:02:44.110 AEDT [24954] LOG:  logical decoding found consistent 
point at 0/1653EB0
2021-02-02 16:02:44.110 AEDT [24954] DETAIL:  There are no running transactions.
2021-02-02 16:02:44.110 AEDT [24954] STATEMENT:  START_REPLICATION SLOT 
"tap_sub" LOGICAL 0/0 (proto_version '2', publication_names '"tap_pub"')
2021-02-02 16:02:44.110 AEDT [24947] LOG:  !!>> apply worker: 
LogicalRepApplyLoop
2021-02-02 16:02:44.110 AEDT [24947] LOG:  !!>> apply worker: called 
process_syncing_tables
2021-02-02 16:02:44.115 AEDT [24955] LOG:  logical replication table 
synchronization worker for subscription "tap_sub", table "test_tab" has started
2021-02-02 16:02:44.115 AEDT [24955] LOG:  !!>> The tablesync worker process 
has PID = 24955
2021-02-02 16:02:44.115 AEDT [24955] LOG:  !!>>


Sleeping 30 secs. For debugging, attach to process 24955 now!


# sync worker is PID 24955
# apply worker is PID 24947

2021-02-02 16:02:44.122 AEDT [24947] LOG:  !!>> apply worker: 
LogicalRepApplyLoop
2021-02-02 16:02:44.122 AEDT 

Re: Single transaction in the tablesync worker?

2021-02-01 Thread Ajin Cherian
Another failure I see in my testing

On publisher create a big enough table:
publisher:
postgres=# CREATE TABLE tab_rep (a int primary key);CREATE TABLE
postgres=# INSERT INTO tab_rep SELECT generate_series(1,100);
INSERT 0 100
postgres=# CREATE PUBLICATION tap_pub FOR ALL TABLES;
CREATE PUBLICATION

Subscriber:
postgres=# CREATE TABLE tab_rep (a int primary key);
CREATE TABLE
postgres=# CREATE SUBSCRIPTION tap_sub CONNECTION 'host=localhost
dbname=postgres port=6972' PUBLICATION tap_pub WITH (enabled = false);

Create the subscription but do not enable it:
The below two commands on the subscriber should be issued quickly with
no delay between them.

postgres=# ALTER SUBSCRIPTION tap_sub enable;
ALTER SUBSCRIPTION
postgres=# ALTER SUBSCRIPTION tap_sub disable;
ALTER SUBSCRIPTION

This leaves the below state for the pg_subscription rel:
postgres=# select * from pg_subscription_rel;
 srsubid | srrelid | srsubstate | srsublsn
---------+---------+------------+----------
   16395 |   16384 | f          |
(1 row)

The rel is in the SUBREL_STATE_FINISHEDCOPY state.
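
For reference, srsubstate can be decoded with something like the following
(a sketch; 'f' here is the new FINISHEDCOPY state added by the patch):

SELECT srrelid::regclass AS rel, srsubstate,
       CASE srsubstate
            WHEN 'i' THEN 'INIT'
            WHEN 'd' THEN 'DATASYNC'
            WHEN 'f' THEN 'FINISHEDCOPY'
            WHEN 's' THEN 'SYNCDONE'
            WHEN 'r' THEN 'READY'
       END AS state_name
FROM pg_subscription_rel;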

Meanwhile on the publisher, looking at the slots created:

postgres=# select * from pg_replication_slots;
      slot_name      |  plugin  | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn | wal_status | safe_wal_size
---------------------+----------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------+------------+---------------
 tap_sub             | pgoutput | logical   |  13859 | postgres | f         | f      |            |      |          517 | 0/9303660   | 0/9303698           | reserved   |
 pg_16395_sync_16384 | pgoutput | logical   |  13859 | postgres | f         | f      |            |      |          517 | 0/9303660   | 0/9303698           | reserved   |
(2 rows)


There are two slots, the main slot as well as the tablesync slot, drop
the table, re-enable the subscription and then drop the subscription

Now on the subscriber:
postgres=# drop table tab_rep;
DROP TABLE
postgres=# ALTER SUBSCRIPTION tap_sub enable;
ALTER SUBSCRIPTION
postgres=# drop subscription tap_sub ;
NOTICE:  dropped replication slot "tap_sub" on publisher
DROP SUBSCRIPTION

We see the tablesync slot dangling in the publisher:
postgres=# select * from pg_replication_slots;
      slot_name      |  plugin  | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn | wal_status | safe_wal_size
---------------------+----------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------+------------+---------------
 pg_16395_sync_16384 | pgoutput | logical   |  13859 | postgres | f         | f      |            |      |          517 | 0/9303660   | 0/9303698           | reserved   |
(1 row)

Dropping the table meant that after the tablesync worker is restarted,
it has no idea about the old slot it created, because the slot's
name uses the relid of the dropped table.
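
Until that is handled, the only way to get rid of the orphaned slot seems to
be manual cleanup on the publisher, e.g.:

SELECT pg_drop_replication_slot('pg_16395_sync_16384');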

regards,
Ajin Cherian
Fujitsu Australia




Re: Single transaction in the tablesync worker?

2021-02-01 Thread Ajin Cherian
On Mon, Feb 1, 2021 at 11:26 PM Amit Kapila  wrote:
> I have updated the patch to display WARNING for each of the tablesync
> slots during DropSubscription. As discussed, I have moved the drop
> slot related code towards the end in AlterSubscription_refresh. Apart
> from this, I have fixed one more issue in tablesync code where in
> after catching the exception we were not clearing the transaction
> state on the publisher, see changes in LogicalRepSyncTableStart. I
> have also fixed other comments raised by you. Additionally, I have
> removed the test because it was creating the same name slot as the
> tablesync worker and tablesync worker removed the same due to new
> logic in LogicalRepSyncStart. Earlier, it was not failing because of
> the bug in that code which I have fixed in the attached.
>

I was testing this patch. I had a table on the subscriber which had a
row that would cause a PK constraint
violation during the table copy. This is resulting in the subscriber
trying to rollback the table copy and failing.

2021-02-01 23:28:16.041 EST [23738] LOG:  logical replication apply
worker for subscription "tap_sub" has started
2021-02-01 23:28:16.051 EST [23740] LOG:  logical replication table
synchronization worker for subscription "tap_sub", table "tab_rep" has
started
2021-02-01 23:28:21.118 EST [23740] ERROR:  table copy could not
rollback transaction on publisher
2021-02-01 23:28:21.118 EST [23740] DETAIL:  The error was: another
command is already in progress
2021-02-01 23:28:21.122 EST [8028] LOG:  background worker "logical
replication worker" (PID 23740) exited with exit code 1
2021-02-01 23:28:21.125 EST [23908] LOG:  logical replication table
synchronization worker for subscription "tap_sub", table "tab_rep" has
started
2021-02-01 23:28:21.138 EST [23908] ERROR:  could not create
replication slot "pg_16398_sync_16384": ERROR:  replication slot
"pg_16398_sync_16384" already exists
2021-02-01 23:28:21.139 EST [8028] LOG:  background worker "logical
replication worker" (PID 23908) exited with exit code 1
2021-02-01 23:28:26.168 EST [24048] LOG:  logical replication table
synchronization worker for subscription "tap_sub", table "tab_rep" has
started
2021-02-01 23:28:34.244 EST [24048] ERROR:  table copy could not
rollback transaction on publisher
2021-02-01 23:28:34.244 EST [24048] DETAIL:  The error was: another
command is already in progress
2021-02-01 23:28:34.251 EST [8028] LOG:  background worker "logical
replication worker" (PID 24048) exited with exit code 1
2021-02-01 23:28:34.254 EST [24337] LOG:  logical replication table
synchronization worker for subscription "tap_sub", table "tab_rep" has
started
2021-02-01 23:28:34.263 EST [24337] ERROR:  could not create
replication slot "pg_16398_sync_16384": ERROR:  replication slot
"pg_16398_sync_16384" already exists
2021-02-01 23:28:34.264 EST [8028] LOG:  background worker "logical
replication worker" (PID 24337) exited with exit code 1

And one more thing I see is that now we error out in PG_CATCH() in
LogicalRepSyncTableStart() with the above error and as a result, the
tablesync slot is not dropped. Hence causing the slot create to fail
in the next restart.
I think this can be avoided. We could either attempt a rollback only
on specific failures, or drop the slot prior to erroring out.

regards,
Ajin Cherian
Fujitsu Australia




Re: Single transaction in the tablesync worker?

2021-02-01 Thread Amit Kapila
On Tue, Feb 2, 2021 at 8:29 AM Peter Smith  wrote:
>
> On Mon, Feb 1, 2021 at 11:26 PM Amit Kapila  wrote:
>
> > I have updated the patch to display WARNING for each of the tablesync
> > slots during DropSubscription. As discussed, I have moved the drop
> > slot related code towards the end in AlterSubscription_refresh. Apart
> > from this, I have fixed one more issue in tablesync code where in
> > after catching the exception we were not clearing the transaction
> > state on the publisher, see changes in LogicalRepSyncTableStart. I
> > have also fixed other comments raised by you.
>
> Here are some additional feedback comments about the v24 patch:
>
> ~~
>
> ReportSlotConnectionError:
>
> 1,2,3,4.
> + foreach(lc, rstates)
> + {
> + SubscriptionRelState *rstate = (SubscriptionRelState *) lfirst(lc);
> + Oid relid = rstate->relid;
> +
> + /* Only cleanup resources of tablesync workers */
> + if (!OidIsValid(relid))
> + continue;
> +
> + /*
> + * Caller needs to ensure that we have appropriate locks so that
> + * relstate doesn't change underneath us.
> + */
> + if (rstate->state != SUBREL_STATE_SYNCDONE)
> + {
> + char syncslotname[NAMEDATALEN] = { 0 };
> +
> + ReplicationSlotNameForTablesync(subid, relid, syncslotname);
> + elog(WARNING, "could not drop tablesync replication slot \"%s\"",
> + syncslotname);
> +
> + }
> + }
>
> 1. I wonder if "rstates" would be better named something like
> "not_ready_rstates", otherwise it is not apparent what states are in
> this list
>

I don't know if that would be better and it is used in the same way in
the existing code. I find the current naming succinct.

> 2. The comment "/* Only cleanup resources of tablesync workers */" is
> not quite correct because there is no cleanup happening here. Maybe
> change to:
> if (!OidIsValid(relid))
> continue; /* not a tablesync worker */
>

Aren't we trying to clean up the tablesync slots here? So, I don't see
the comment as irrelevant.

> 3. Maybe the "appropriate locks" comment can say what locks are the
> "appropriate" ones?
>
> 4. Spurious blank line after the elog?
>

Will fix both the above.

> ~~
>
> AlterSubscription_refresh:
>
> 5.
> + /*
> + * Drop the tablesync slot. This has to be at the end because
> otherwise if there
> + * is an error while doing the database operations we won't be able to 
> rollback
> + * dropped slot.
> + */
>
> Maybe "Drop the tablesync slot." should say "Drop the tablesync slots
> associated with removed tables."
>

makes sense, will fix.

> ~~
>
> DropSubscription:
>
> 6.
> + /*
> + * Cleanup of tablesync replication origins.
> + *
> + * Any READY-state relations would already have dealt with clean-ups.
> + *
> + * Note that the state can't change because we have already stopped both
> + * the apply and tablesync workers and they can't restart because of
> + * exclusive lock on the subscription.
> + */
> + rstates = GetSubscriptionNotReadyRelations(subid);
> + foreach(lc, rstates)
>
> I wonder if "rstates" would be better named as "not_ready_rstates",
> because it is used in several places where not READY is assumed.
>

Same response as above for similar comment.

> 7.
> + {
> + if (!slotname)
> + {
> + /* be tidy */
> + list_free(rstates);
> + return;
> + }
> + else
> + {
> + ReportSlotConnectionError(rstates, subid, slotname, err);
> + }
> +
> + }
>
> Spurious blank line above?
>

Will fix.

> 8.
> The new logic of calling the ReportSlotConnectionError seems to be
> expecting that the user has encountered some connection error, and
> *after* that they have assigned slot_name = NONE as a workaround. In
> this scenario the code looks ok since names of any dangling tablesync
> slots were being logged at the time of the error.
>
> But I am wondering what about where the user might have set slot_name
> = NONE *before* the connection is broken. In this scenario, there is
> no ERROR, so if there are still (is it possible?) dangling tablesync
> slots, their names are never getting logged at all. So how can the
> user know what to delete?
>

It has been mentioned in the docs that the user is responsible for
cleaning that up manually in such a case. The patch has also described
how the names are generated, which can help the user remove those slots.
+These table synchronization slots have generated names:
+pg_%u_sync_%u (parameters: Subscription
+oid, Table relid)
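
So, for example, a user can list the expected tablesync slot names for a
subscription (say 'mysub') straight from the catalogs, something like:

SELECT 'pg_' || srsubid || '_sync_' || srrelid AS tablesync_slot_name
FROM pg_subscription_rel
WHERE srsubid = (SELECT oid FROM pg_subscription WHERE subname = 'mysub');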

I think if the user changes slot_name associated with the
subscription, it would be his responsibility to clean up the
previously associated slot. This is currently the case with the main
subscription slot as well. I think it won't be advisable for the user
to change slot_name except in some rare cases where the system
might be stuck, like the one for which we are giving a WARNING and
providing a hint to set the slot_name to NONE.
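
(The sequence being referred to is, roughly:

ALTER SUBSCRIPTION mysub DISABLE;
ALTER SUBSCRIPTION mysub SET (slot_name = NONE);
DROP SUBSCRIPTION mysub;

followed by manually dropping whatever slots are still left on the publisher.)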


> ~~
>
> > Additionally, I have
> > removed the test because it was creating the same name slot as the
> > tablesync worker and tablesync worker removed the same due to new
> > logic in 

Re: Single transaction in the tablesync worker?

2021-02-01 Thread Peter Smith
On Mon, Feb 1, 2021 at 11:26 PM Amit Kapila  wrote:

> I have updated the patch to display WARNING for each of the tablesync
> slots during DropSubscription. As discussed, I have moved the drop
> slot related code towards the end in AlterSubscription_refresh. Apart
> from this, I have fixed one more issue in tablesync code where in
> after catching the exception we were not clearing the transaction
> state on the publisher, see changes in LogicalRepSyncTableStart. I
> have also fixed other comments raised by you.

Here are some additional feedback comments about the v24 patch:

~~

ReportSlotConnectionError:

1,2,3,4.
+ foreach(lc, rstates)
+ {
+ SubscriptionRelState *rstate = (SubscriptionRelState *) lfirst(lc);
+ Oid relid = rstate->relid;
+
+ /* Only cleanup resources of tablesync workers */
+ if (!OidIsValid(relid))
+ continue;
+
+ /*
+ * Caller needs to ensure that we have appropriate locks so that
+ * relstate doesn't change underneath us.
+ */
+ if (rstate->state != SUBREL_STATE_SYNCDONE)
+ {
+ char syncslotname[NAMEDATALEN] = { 0 };
+
+ ReplicationSlotNameForTablesync(subid, relid, syncslotname);
+ elog(WARNING, "could not drop tablesync replication slot \"%s\"",
+ syncslotname);
+
+ }
+ }

1. I wonder if "rstates" would be better named something like
"not_ready_rstates", otherwise it is not apparent what states are in
this list

2. The comment "/* Only cleanup resources of tablesync workers */" is
not quite correct because there is no cleanup happening here. Maybe
change to:
if (!OidIsValid(relid))
continue; /* not a tablesync worker */

3. Maybe the "appropriate locks" comment can say what locks are the
"appropriate" ones?

4. Spurious blank line after the elog?

~~

AlterSubscription_refresh:

5.
+ /*
+ * Drop the tablesync slot. This has to be at the end because
otherwise if there
+ * is an error while doing the database operations we won't be able to rollback
+ * dropped slot.
+ */

Maybe "Drop the tablesync slot." should say "Drop the tablesync slots
associated with removed tables."

~~

DropSubscription:

6.
+ /*
+ * Cleanup of tablesync replication origins.
+ *
+ * Any READY-state relations would already have dealt with clean-ups.
+ *
+ * Note that the state can't change because we have already stopped both
+ * the apply and tablesync workers and they can't restart because of
+ * exclusive lock on the subscription.
+ */
+ rstates = GetSubscriptionNotReadyRelations(subid);
+ foreach(lc, rstates)

I wonder if "rstates" would be better named as "not_ready_rstates",
because it is used in several places where not READY is assumed.

7.
+ {
+ if (!slotname)
+ {
+ /* be tidy */
+ list_free(rstates);
+ return;
+ }
+ else
+ {
+ ReportSlotConnectionError(rstates, subid, slotname, err);
+ }
+
+ }

Spurious blank line above?

8.
The new logic of calling the ReportSlotConnectionError seems to be
expecting that the user has encountered some connection error, and
*after* that they have assigned slot_name = NONE as a workaround. In
this scenario the code looks ok since names of any dangling tablesync
slots were being logged at the time of the error.

But I am wondering what about where the user might have set slot_name
= NONE *before* the connection is broken. In this scenario, there is
no ERROR, so if there are still (is it possible?) dangling tablesync
slots, their names are never getting logged at all. So how can the
user know what to delete?

~~

> Additionally, I have
> removed the test because it was creating the same name slot as the
> tablesync worker and tablesync worker removed the same due to new
> logic in LogicalRepSyncStart. Earlier, it was not failing because of
> the bug in that code which I have fixed in the attached.

Wasn't causing a tablesync slot clash and seeing if it could recover
the whole point of that test? Why not just keep the test, and modify it to
make it work again? Isn't it still valuable because at least it would
execute the code through the PG_CATCH which otherwise may not get
executed by any other test?

>
> I wonder whether we should restrict creating slots with prefix pg_
> because we are internally creating slots with those names? I think
> this was a problem previously also. We already prohibit it for few
> other objects like origins, schema, etc., see the usage of
> IsReservedName.
>

Yes, we could restrict the create slot API if you really wanted to.
But, IMO it is implausible that a user could "accidentally" clash with
the internal tablesync slot name, so in practice maybe this change
would not help much but it might make it more difficult to test some
scenarios.
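
For example, a clash is easy to manufacture deliberately for testing today
(hypothetical OIDs, matching the generated name format):

SELECT pg_create_logical_replication_slot('pg_16398_sync_16384', 'pgoutput');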


Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Single transaction in the tablesync worker?

2021-02-01 Thread Amit Kapila
On Mon, Feb 1, 2021 at 11:23 AM Peter Smith  wrote:
>
> On Mon, Feb 1, 2021 at 3:44 PM Amit Kapila  wrote:
> >
> > On Mon, Feb 1, 2021 at 9:38 AM Peter Smith  wrote:
> >
> > No, there is a functionality change as well. The way we have code in
> > v22 can easily lead to a problem when we have dropped the slots but
> > get an error while removing origins or an entry from subscription rel.
> > In such cases, we won't be able to rollback the drop of slots but the
> > other database operations will be rolled back. This is the reason we
> > have to drop the slots at the end. We need to ensure the same thing
> > for AlterSubscription_refresh. Does this make sense now?
> >
>
> OK.
>

I have updated the patch to display WARNING for each of the tablesync
slots during DropSubscription. As discussed, I have moved the drop
slot related code towards the end in AlterSubscription_refresh. Apart
from this, I have fixed one more issue in tablesync code where in
after catching the exception we were not clearing the transaction
state on the publisher, see changes in LogicalRepSyncTableStart. I
have also fixed other comments raised by you. Additionally, I have
removed the test because it was creating the same name slot as the
tablesync worker and tablesync worker removed the same due to new
logic in LogicalRepSyncStart. Earlier, it was not failing because of
the bug in that code which I have fixed in the attached.

I wonder whether we should restrict creating slots with prefix pg_
because we are internally creating slots with those names? I think
this was a problem previously also. We already prohibit it for few
other objects like origins, schema, etc., see the usage of
IsReservedName.




--
With Regards,
Amit Kapila.


v24-0001-Tablesync-Solution1.patch
Description: Binary data


Re: Single transaction in the tablesync worker?

2021-02-01 Thread Amit Kapila
On Mon, Feb 1, 2021 at 1:08 PM Peter Smith  wrote:
>
> On Mon, Feb 1, 2021 at 5:19 PM Amit Kapila  wrote:
>
>  > > > AFAIK there is always a potential race with DropSubscription dropping
> > > > > slots. The DropSubscription might be running at exactly the same time
> > > > > the apply worker has just dropped the very same tablesync slot.
> > > > >
> > > >
> > > > We stopped the workers before getting a list of NotReady relations and
> > > > then we try to drop the corresponding slots. So, how such a race
> > > > condition can happen? Note, because we have a lock on pg_subscription,
> > > > there is no chance that the workers can restart till the transaction
> > > > end.
> > >
> > > OK. I think I was forgetting the logicalrep_worker_stop would also go
> > > into a loop waiting for the worker process to die. So even if the
> > > tablesync worker does simultaneously drop it's own slot, I think it
> > > will certainly at least be in SYNCDONE state before DropSubscription
> > > does anything else with that worker.
> > >
> >
> > How is that ensured? We don't have anything like HOLD_INTERRUPTS
> > between the time dropped the slot and updated rel state as SYNCDONE.
> > So, isn't it possible that after we dropped the slot and before we
> > update the state, the SIGTERM signal arrives and led to worker exit?
> >
>
> The worker has the SIGTERM handler of "die". IIUC the "die" function
> doesn't normally do anything except set some flags to say please die
> at the next convenient opportunity. My understanding is that the
> worker process will not actually exit until it next executes
> CHECK_FOR_INTERRUPTS(), whereupon it will see the ProcDiePending flag
> and *really* die. So even if the SIGTERM signal arrives immediately
> after the slot is dropped, the tablesync will still become SYNCDONE.
> Is this wrong understanding?
>
> But your scenario could still be possible if "die" exited immediately
> (e.g. only in single user mode?).
>

I think it is possible without that as well. There are many calls
in-between those two operations which can internally call
CHECK_FOR_INTERRUPTS. One of the flows where such a possibility exists
is 
UpdateSubscriptionRelState->SearchSysCacheCopy2->SearchSysCacheCopy->SearchSysCache->SearchCatCache->SearchCatCacheInternal->SearchCatCacheMiss->systable_getnext.
This can internally call heapgetpage where we have
CHECK_FOR_INTERRUPTS. I think even if today there was no CFI call we
can't take a guarantee for the future as the calls used are quite
common. So, probably we need missing_ok flag in DropSubscription.

One more point: in the tablesync code you are calling
ReplicationSlotDropAtPubNode with missing_ok as false. What if we get
an error after that and before we have marked the state as SYNCDONE? I
guess it will then always error out from ReplicationSlotDropAtPubNode
because we had already dropped the slot.

-- 
With Regards,
Amit Kapila.




Re: Single transaction in the tablesync worker?

2021-01-31 Thread Peter Smith
On Mon, Feb 1, 2021 at 5:19 PM Amit Kapila  wrote:

 > > > AFAIK there is always a potential race with DropSubscription dropping
> > > > slots. The DropSubscription might be running at exactly the same time
> > > > the apply worker has just dropped the very same tablesync slot.
> > > >
> > >
> > > We stopped the workers before getting a list of NotReady relations and
> > > then we try to drop the corresponding slots. So, how such a race
> > > condition can happen? Note, because we have a lock on pg_subscription,
> > > there is no chance that the workers can restart till the transaction
> > > end.
> >
> > OK. I think I was forgetting the logicalrep_worker_stop would also go
> > into a loop waiting for the worker process to die. So even if the
> > tablesync worker does simultaneously drop it's own slot, I think it
> > will certainly at least be in SYNCDONE state before DropSubscription
> > does anything else with that worker.
> >
>
> How is that ensured? We don't have anything like HOLD_INTERRUPTS
> between the time dropped the slot and updated rel state as SYNCDONE.
> So, isn't it possible that after we dropped the slot and before we
> update the state, the SIGTERM signal arrives and led to worker exit?
>

The worker has the SIGTERM handler of "die". IIUC the "die" function
doesn't normally do anything except set some flags to say please die
at the next convenient opportunity. My understanding is that the
worker process will not actually exit until it next executes
CHECK_FOR_INTERRUPTS(), whereupon it will see the ProcDiePending flag
and *really* die. So even if the SIGTERM signal arrives immediately
after the slot is dropped, the tablesync will still become SYNCDONE.
Is this wrong understanding?

But your scenario could still be possible if "die" exited immediately
(e.g. only in single user mode?).


Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Single transaction in the tablesync worker?

2021-01-31 Thread Amit Kapila
On Mon, Feb 1, 2021 at 11:23 AM Peter Smith  wrote:
>
> On Mon, Feb 1, 2021 at 3:44 PM Amit Kapila  wrote:
> >
> > On Mon, Feb 1, 2021 at 9:38 AM Peter Smith  wrote:
> > >
> > > > I think this is true only when the user specifically requested it by
> > > > the use of "ALTER SUBSCRIPTION ... SET (slot_name = NONE)", right?
> > > > Otherwise, we give an error on a broken connection. Also, if that is
> > > > true then is there a reason to pass missing_ok as true while dropping
> > > > tablesync slots?
> > > >
> > >
> > > AFAIK there is always a potential race with DropSubscription dropping
> > > slots. The DropSubscription might be running at exactly the same time
> > > the apply worker has just dropped the very same tablesync slot.
> > >
> >
> > We stopped the workers before getting a list of NotReady relations and
> > then we try to drop the corresponding slots. So, how such a race
> > condition can happen? Note, because we have a lock on pg_subscription,
> > there is no chance that the workers can restart till the transaction
> > end.
>
> OK. I think I was forgetting the logicalrep_worker_stop would also go
> into a loop waiting for the worker process to die. So even if the
> tablesync worker does simultaneously drop it's own slot, I think it
> will certainly at least be in SYNCDONE state before DropSubscription
> does anything else with that worker.
>

How is that ensured? We don't have anything like HOLD_INTERRUPTS
between the time dropped the slot and updated rel state as SYNCDONE.
So, isn't it possible that after we dropped the slot and before we
update the state, the SIGTERM signal arrives and led to worker exit?

-- 
With Regards,
Amit Kapila.




Re: Single transaction in the tablesync worker?

2021-01-31 Thread Peter Smith
On Mon, Feb 1, 2021 at 3:44 PM Amit Kapila  wrote:
>
> On Mon, Feb 1, 2021 at 9:38 AM Peter Smith  wrote:
> >
> > On Mon, Feb 1, 2021 at 1:54 PM Amit Kapila  wrote:
> > >
> > > On Mon, Feb 1, 2021 at 6:48 AM Peter Smith  wrote:
> > > >
> > > > On Sun, Jan 31, 2021 at 12:19 AM Amit Kapila  
> > > > wrote:
> > > > >
> > > > > I have made the below changes in the patch. Let me know what you think
> > > > > about these?
> > > > > 1. It was a bit difficult to understand the code in DropSubscription
> > > > > so I have rearranged the code to match the way we are doing in HEAD
> > > > > where we drop the slots at the end after finishing all the other
> > > > > cleanup.
> > > >
> > > > There was a reason why the v22 logic was different from HEAD.
> > > >
> > > > The broken connection leaves dangling slots which is unavoidable.
> > > >
> > >
> > > I think this is true only when the user specifically requested it by
> > > the use of "ALTER SUBSCRIPTION ... SET (slot_name = NONE)", right?
> > > Otherwise, we give an error on a broken connection. Also, if that is
> > > true then is there a reason to pass missing_ok as true while dropping
> > > tablesync slots?
> > >
> >
> > AFAIK there is always a potential race with DropSubscription dropping
> > slots. The DropSubscription might be running at exactly the same time
> > the apply worker has just dropped the very same tablesync slot.
> >
>
> We stopped the workers before getting a list of NotReady relations and
> then we try to drop the corresponding slots. So, how such a race
> condition can happen? Note, because we have a lock on pg_subscription,
> there is no chance that the workers can restart till the transaction
> end.

OK. I think I was forgetting the logicalrep_worker_stop would also go
into a loop waiting for the worker process to die. So even if the
tablesync worker does simultaneously drop it's own slot, I think it
will certainly at least be in SYNCDONE state before DropSubscription
does anything else with that worker.

>
> > By
> > saying missing_ok = true it means DropSubscription would not give
> > ERROR in such a case, so at least the DROP SUBSCRIPTION would not fail
> > with an unexpected error.
> >
> > >
> > > > But,
> > > > whereas the user knows the name of the Subscription slot (they named
> > > > it), there is no easy way for them to know the names of the remaining
> > > > tablesync slots unless we log them.
> > > >
> > > > That is why the v22 code was written to process the tablesync slots
> > > > even for wrconn == NULL so their name could be logged:
> > > > elog(WARNING, "no connection; cannot drop tablesync slot \"%s\".",
> > > > syncslotname);
> > > >
> > > > The v23 patch removed this dangling slot name information, so it makes
> > > > it difficult for the user to know what tablesync slots to cleanup.
> > > >
> > >
> > > Okay, then can we think of combining with the existing error of the
> > > replication slot? I think that might produce a very long message, so
> > > another idea could be to LOG a separate WARNING for each such slot
> > > just before giving the error.
> >
> > There may be many subscribed tables so I agree combining to one
> > message might be too long. Yes, we can add another loop to output the
> > necessary information. But, isn’t logging each tablesync slot WARNING
> > before the subscription slot ERROR exactly the behaviour which already
> > existed in v22. IIUC the DropSubscription refactoring in V23 was not
> > done for any functional change, but was intended only to make the code
> > simpler, but how is that goal achieved if v23 ends up needing 3 loops
> > where v22 only needed 1 loop to do the same thing?
> >
>
> No, there is a functionality change as well. The way we have code in
> v22 can easily lead to a problem when we have dropped the slots but
> get an error while removing origins or an entry from subscription rel.
> In such cases, we won't be able to rollback the drop of slots but the
> other database operations will be rolled back. This is the reason we
> have to drop the slots at the end. We need to ensure the same thing
> for AlterSubscription_refresh. Does this make sense now?
>

OK.


Kind Regards,
Peter Smith.
Fujitsu Australia.




Re: Single transaction in the tablesync worker?

2021-01-31 Thread Amit Kapila
On Mon, Feb 1, 2021 at 10:14 AM Amit Kapila  wrote:
>
> On Mon, Feb 1, 2021 at 9:38 AM Peter Smith  wrote:
> >
> > On Mon, Feb 1, 2021 at 1:54 PM Amit Kapila  wrote:
> > >
> > > On Mon, Feb 1, 2021 at 6:48 AM Peter Smith  wrote:
> > > >
> > > > On Sun, Jan 31, 2021 at 12:19 AM Amit Kapila  
> > > > wrote:
> > > > >
> > > > > I have made the below changes in the patch. Let me know what you think
> > > > > about these?
> > > > > 1. It was a bit difficult to understand the code in DropSubscription
> > > > > so I have rearranged the code to match the way we are doing in HEAD
> > > > > where we drop the slots at the end after finishing all the other
> > > > > cleanup.
> > > >
> > > > There was a reason why the v22 logic was different from HEAD.
> > > >
> > > > The broken connection leaves dangling slots which is unavoidable.
> > > >
> > >
> > > I think this is true only when the user specifically requested it by
> > > the use of "ALTER SUBSCRIPTION ... SET (slot_name = NONE)", right?
> > > Otherwise, we give an error on a broken connection. Also, if that is
> > > true then is there a reason to pass missing_ok as true while dropping
> > > tablesync slots?
> > >
> >
> > AFAIK there is always a potential race with DropSubscription dropping
> > slots. The DropSubscription might be running at exactly the same time
> > the apply worker has just dropped the very same tablesync slot.
> >
>
> We stopped the workers before getting a list of NotReady relations and
> then we try to drop the corresponding slots. So, how such a race
> condition can happen?
>

I think it is possible that the state is still not SYNCDONE but the
slot is already dropped so here we should be ready with the missing
slot.


-- 
With Regards,
Amit Kapila.




Re: Single transaction in the tablesync worker?

2021-01-31 Thread Amit Kapila
On Mon, Feb 1, 2021 at 9:38 AM Peter Smith  wrote:
>
> On Mon, Feb 1, 2021 at 1:54 PM Amit Kapila  wrote:
> >
> > On Mon, Feb 1, 2021 at 6:48 AM Peter Smith  wrote:
> > >
> > > On Sun, Jan 31, 2021 at 12:19 AM Amit Kapila  
> > > wrote:
> > > >
> > > > I have made the below changes in the patch. Let me know what you think
> > > > about these?
> > > > 1. It was a bit difficult to understand the code in DropSubscription
> > > > so I have rearranged the code to match the way we are doing in HEAD
> > > > where we drop the slots at the end after finishing all the other
> > > > cleanup.
> > >
> > > There was a reason why the v22 logic was different from HEAD.
> > >
> > > The broken connection leaves dangling slots which is unavoidable.
> > >
> >
> > I think this is true only when the user specifically requested it by
> > the use of "ALTER SUBSCRIPTION ... SET (slot_name = NONE)", right?
> > Otherwise, we give an error on a broken connection. Also, if that is
> > true then is there a reason to pass missing_ok as true while dropping
> > tablesync slots?
> >
>
> AFAIK there is always a potential race with DropSubscription dropping
> slots. The DropSubscription might be running at exactly the same time
> the apply worker has just dropped the very same tablesync slot.
>

We stopped the workers before getting a list of NotReady relations and
then we try to drop the corresponding slots. So, how such a race
condition can happen? Note, because we have a lock on pg_subscription,
there is no chance that the workers can restart till the transaction
end.

> By
> saying missing_ok = true it means DropSubscription would not give
> ERROR in such a case, so at least the DROP SUBSCRIPTION would not fail
> with an unexpected error.
>
> >
> > > But,
> > > whereas the user knows the name of the Subscription slot (they named
> > > it), there is no easy way for them to know the names of the remaining
> > > tablesync slots unless we log them.
> > >
> > > That is why the v22 code was written to process the tablesync slots
> > > even for wrconn == NULL so their name could be logged:
> > > elog(WARNING, "no connection; cannot drop tablesync slot \"%s\".",
> > > syncslotname);
> > >
> > > The v23 patch removed this dangling slot name information, so it makes
> > > it difficult for the user to know what tablesync slots to cleanup.
> > >
> >
> > Okay, then can we think of combining with the existing error of the
> > replication slot? I think that might produce a very long message, so
> > another idea could be to LOG a separate WARNING for each such slot
> > just before giving the error.
>
> There may be many subscribed tables so I agree combining to one
> message might be too long. Yes, we can add another loop to output the
> necessary information. But, isn’t logging each tablesync slot WARNING
> before the subscription slot ERROR exactly the behaviour which already
> existed in v22. IIUC the DropSubscription refactoring in V23 was not
> done for any functional change, but was intended only to make the code
> simpler, but how is that goal achieved if v23 ends up needing 3 loops
> where v22 only needed 1 loop to do the same thing?
>

No, there is a functionality change as well. The way we have code in
v22 can easily lead to a problem when we have dropped the slots but
get an error while removing origins or an entry from subscription rel.
In such cases, we won't be able to rollback the drop of slots but the
other database operations will be rolled back. This is the reason we
have to drop the slots at the end. We need to ensure the same thing
for AlterSubscription_refresh. Does this make sense now?

-- 
With Regards,
Amit Kapila.




Re: Single transaction in the tablesync worker?

2021-01-31 Thread Peter Smith
On Mon, Feb 1, 2021 at 1:54 PM Amit Kapila  wrote:
>
> On Mon, Feb 1, 2021 at 6:48 AM Peter Smith  wrote:
> >
> > On Sun, Jan 31, 2021 at 12:19 AM Amit Kapila  
> > wrote:
> > >
> > > I have made the below changes in the patch. Let me know what you think
> > > about these?
> > > 1. It was a bit difficult to understand the code in DropSubscription
> > > so I have rearranged the code to match the way we are doing in HEAD
> > > where we drop the slots at the end after finishing all the other
> > > cleanup.
> >
> > There was a reason why the v22 logic was different from HEAD.
> >
> > The broken connection leaves dangling slots which is unavoidable.
> >
>
> I think this is true only when the user specifically requested it by
> the use of "ALTER SUBSCRIPTION ... SET (slot_name = NONE)", right?
> Otherwise, we give an error on a broken connection. Also, if that is
> true then is there a reason to pass missing_ok as true while dropping
> tablesync slots?
>

AFAIK there is always a potential race with DropSubscription dropping
slots. The DropSubscription might be running at exactly the same time
the apply worker has just dropped the very same tablesync slot. By
saying missing_ok = true it means DropSubscription would not give
ERROR in such a case, so at least the DROP SUBSCRIPTION would not fail
with an unexpected error.

>
> > But,
> > whereas the user knows the name of the Subscription slot (they named
> > it), there is no easy way for them to know the names of the remaining
> > tablesync slots unless we log them.
> >
> > That is why the v22 code was written to process the tablesync slots
> > even for wrconn == NULL so their name could be logged:
> > elog(WARNING, "no connection; cannot drop tablesync slot \"%s\".",
> > syncslotname);
> >
> > The v23 patch removed this dangling slot name information, so it makes
> > it difficult for the user to know what tablesync slots to cleanup.
> >
>
> Okay, then can we think of combining with the existing error of the
> replication slot? I think that might produce a very long message, so
> another idea could be to LOG a separate WARNING for each such slot
> just before giving the error.

There may be many subscribed tables so I agree combining to one
message might be too long. Yes, we can add another loop to output the
necessary information. But, isn’t logging each tablesync slot WARNING
before the subscription slot ERROR exactly the behaviour which already
existed in v22. IIUC the DropSubscription refactoring in V23 was not
done for any functional change, but was intended only to make the code
simpler, but how is that goal achieved if v23 ends up needing 3 loops
where v22 only needed 1 loop to do the same thing?


Kind Regards,
Peter Smith.
Fujitsu Australia.




Re: Single transaction in the tablesync worker?

2021-01-31 Thread Amit Kapila
On Mon, Feb 1, 2021 at 6:48 AM Peter Smith  wrote:
>
> On Sun, Jan 31, 2021 at 12:19 AM Amit Kapila  wrote:
> >
> > I have made the below changes in the patch. Let me know what you think
> > about these?
> > 1. It was a bit difficult to understand the code in DropSubscription
> > so I have rearranged the code to match the way we are doing in HEAD
> > where we drop the slots at the end after finishing all the other
> > cleanup.
>
> There was a reason why the v22 logic was different from HEAD.
>
> The broken connection leaves dangling slots which is unavoidable.
>

I think this is true only when the user specifically requested it by
the use of "ALTER SUBSCRIPTION ... SET (slot_name = NONE)", right?
Otherwise, we give an error on a broken connection. Also, if that is
true then is there a reason to pass missing_ok as true while dropping
tablesync slots?


> But,
> whereas the user knows the name of the Subscription slot (they named
> it), there is no easy way for them to know the names of the remaining
> tablesync slots unless we log them.
>
> That is why the v22 code was written to process the tablesync slots
> even for wrconn == NULL so their name could be logged:
> elog(WARNING, "no connection; cannot drop tablesync slot \"%s\".",
> syncslotname);
>
> The v23 patch removed this dangling slot name information, so it makes
> it difficult for the user to know what tablesync slots to cleanup.
>

Okay, then can we think of combining with the existing error of the
replication slot? I think that might produce a very long message, so
another idea could be to LOG a separate WARNING for each such slot
just before giving the error.

> > 2. In AlterSubscription_refresh(), we can't allow workers to be
> > stopped at commit time as we have already dropped the slots because
> > the worker can access the dropped slot. We need to stop the workers
> > before dropping slots. This makes all the code related to
> > logicalrep_worker_stop_at_commit redundant.
>
> OK.
>
> > 3. In AlterSubscription_refresh(), we need to acquire the lock on
> > pg_subscription_rel only when we try to remove any subscription rel.
>
> + if (!sub_rel_locked)
> + {
> + rel = table_open(SubscriptionRelRelationId, AccessExclusiveLock);
> + sub_rel_locked = true;
> + }
>
> OK. But the sub_rel_locked bool is not really necessary. Why not just
> check for rel == NULL? e.g.
> if (!rel)
> rel = table_open(SubscriptionRelRelationId, AccessExclusiveLock);
>

Okay, that seems to be better, will change accordingly.

> > 4. Added/Changed quite a few comments.
> >
>
> @@ -1042,6 +1115,31 @@ DropSubscription(DropSubscriptionStmt *stmt,
> bool isTopLevel)
>   }
>   list_free(subworkers);
>
> + /*
> + * Tablesync resource cleanup (slots and origins).
>
> The comment is misleading; this code is only dropping origins.
>

Okay, will change to something like: "Cleanup of tablesync replication origins."

> @@ -73,20 +73,6 @@ typedef struct LogicalRepWorkerId
>   Oid relid;
>  } LogicalRepWorkerId;
>
> -typedef struct StopWorkersData
> -{
> - int nestDepth; /* Sub-transaction nest level */
> - List*workers; /* List of LogicalRepWorkerId */
> - struct StopWorkersData *parent; /* This need not be an immediate
> - * subtransaction parent */
> -} StopWorkersData;
>
> Since v23 removes that typedef from the code, don't you also have to
> remove it from src/tools/pgindent/typedefs.list?
>

Yeah.

-- 
With Regards,
Amit Kapila.




Re: Single transaction in the tablesync worker?

2021-01-31 Thread Peter Smith
On Sun, Jan 31, 2021 at 12:19 AM Amit Kapila  wrote:

> 2. In AlterSubscription_refresh(), we can't allow workers to be
> stopped at commit time as we have already dropped the slots because
> the worker can access the dropped slot. We need to stop the workers
> before dropping slots. This makes all the code related to
> logicalrep_worker_stop_at_commit redundant.

@@ -73,20 +73,6 @@ typedef struct LogicalRepWorkerId
  Oid relid;
 } LogicalRepWorkerId;

-typedef struct StopWorkersData
-{
- int nestDepth; /* Sub-transaction nest level */
- List*workers; /* List of LogicalRepWorkerId */
- struct StopWorkersData *parent; /* This need not be an immediate
- * subtransaction parent */
-} StopWorkersData;

Since v23 removes that typedef from the code, don't you also have to
remove it from src/tools/pgindent/typedefs.list?


Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Single transaction in the tablesync worker?

2021-01-31 Thread Peter Smith
On Sun, Jan 31, 2021 at 12:19 AM Amit Kapila  wrote:
>
> I have made the below changes in the patch. Let me know what you think
> about these?
> 1. It was a bit difficult to understand the code in DropSubscription
> so I have rearranged the code to match the way we are doing in HEAD
> where we drop the slots at the end after finishing all the other
> cleanup.

There was a reason why the v22 logic was different from HEAD.

The broken connection leaves dangling slots which is unavoidable. But,
whereas the user knows the name of the Subscription slot (they named
it), there is no easy way for them to know the names of the remaining
tablesync slots unless we log them.

That is why the v22 code was written to process the tablesync slots
even for wrconn == NULL so their name could be logged:
elog(WARNING, "no connection; cannot drop tablesync slot \"%s\".",
syncslotname);

The v23 patch removed this dangling slot name information, so it makes
it difficult for the user to know what tablesync slots to cleanup.

> 2. In AlterSubscription_refresh(), we can't allow workers to be
> stopped at commit time as we have already dropped the slots because
> the worker can access the dropped slot. We need to stop the workers
> before dropping slots. This makes all the code related to
> logicalrep_worker_stop_at_commit redundant.

OK.

> 3. In AlterSubscription_refresh(), we need to acquire the lock on
> pg_subscription_rel only when we try to remove any subscription rel.

+ if (!sub_rel_locked)
+ {
+ rel = table_open(SubscriptionRelRelationId, AccessExclusiveLock);
+ sub_rel_locked = true;
+ }

OK. But the sub_rel_locked bool is not really necessary. Why not just
check for rel == NULL? e.g.
if (!rel)
rel = table_open(SubscriptionRelRelationId, AccessExclusiveLock);

> 4. Added/Changed quite a few comments.
>

@@ -1042,6 +1115,31 @@ DropSubscription(DropSubscriptionStmt *stmt,
bool isTopLevel)
  }
  list_free(subworkers);

+ /*
+ * Tablesync resource cleanup (slots and origins).

The comment is misleading; this code is only dropping origins.


Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Single transaction in the tablesync worker?

2021-01-30 Thread Amit Kapila
On Fri, Jan 29, 2021 at 4:07 PM Peter Smith  wrote:
>
>
> Differences from v21:
> + Patch is rebased to latest OSS HEAD @ 29/Jan.
> + Includes new code as suggested [ak0128] to ensure no dangling slots
> at Drop/AlterSubscription.
> + Removes the slot/origin cleanup down by process interrupt logic
> (cleanup_at_shutdown function).
> + Addresses some minor review comments.
>

I have made the below changes in the patch. Let me know what you think
about these?
1. It was a bit difficult to understand the code in DropSubscription
so I have rearranged the code to match the way we are doing in HEAD
where we drop the slots at the end after finishing all the other
cleanup.
2. In AlterSubscription_refresh(), we can't allow workers to be
stopped at commit time as we have already dropped the slots because
the worker can access the dropped slot. We need to stop the workers
before dropping slots. This makes all the code related to
logicalrep_worker_stop_at_commit redundant.
3. In AlterSubscription_refresh(), we need to acquire the lock on
pg_subscription_rel only when we try to remove any subscription rel.
4. Added/Changed quite a few comments.

-- 
With Regards,
Amit Kapila.


v23-0001-Tablesync-Solution1.patch
Description: Binary data


Re: Single transaction in the tablesync worker?

2021-01-29 Thread Peter Smith
On Thu, Jan 28, 2021 at 9:37 PM Amit Kapila  wrote:
>
> On Thu, Jan 28, 2021 at 12:32 PM Peter Smith  wrote:
> >
> > On Wed, Jan 27, 2021 at 2:53 PM Amit Kapila  wrote:
> > >
> > > On Sat, Jan 23, 2021 at 5:56 PM Amit Kapila  
> > > wrote:
> > > >
> > > > On Sat, Jan 23, 2021 at 4:55 AM Peter Smith  
> > > > wrote:
> > > > >
> > > > > PSA the v18 patch for the Tablesync Solution1.
> > > >
> > > > 7. Have you tested with the new patch the scenario where we crash
> > > > after FINISHEDCOPY and before SYNCDONE, is it able to pick up the
> > > > replication using the new temporary slot? Here, we need to test the
> > > > case where during the catchup phase we have received few commits and
> > > > then the tablesync worker is crashed/errored out? Basically, check if
> > > > the replication is continued from the same point?
> > > >
> > >
> > > I have tested this and it didn't work, see the below example.
> > >
> > > Publisher-side
> > > 
> > > CREATE TABLE mytbl1(id SERIAL PRIMARY KEY, somedata int, text 
> > > varchar(120));
> > >
> > > BEGIN;
> > > INSERT INTO mytbl1(somedata, text) VALUES (1, 1);
> > > INSERT INTO mytbl1(somedata, text) VALUES (1, 2);
> > > COMMIT;
> > >
> > > CREATE PUBLICATION mypublication FOR TABLE mytbl1;
> > >
> > > Subscriber-side
> > > 
> > > - Have a while(1) loop in LogicalRepSyncTableStart so that tablesync
> > > worker stops.
> > >
> > > CREATE TABLE mytbl1(id SERIAL PRIMARY KEY, somedata int, text 
> > > varchar(120));
> > >
> > >
> > > CREATE SUBSCRIPTION mysub
> > >  CONNECTION 'host=localhost port=5432 dbname=postgres'
> > > PUBLICATION mypublication;
> > >
> > > During debug, stop after we mark FINISHEDCOPY state.
> > >
> > > Publisher-side
> > > 
> > > INSERT INTO mytbl1(somedata, text) VALUES (1, 3);
> > > INSERT INTO mytbl1(somedata, text) VALUES (1, 4);
> > >
> > >
> > > Subscriber-side
> > > 
> > > - Have a breakpoint in apply_dispatch
> > > - continue in debugger;
> > > - After we replay first commit (which will be for values(1,3), note
> > > down the origin position in apply_handle_commit_internal and somehow
> > > error out. I have forced the debugger to set to the last line in
> > > apply_dispatch where the error is raised.
> > > - After the error, again the tablesync worker is restarted and it
> > > starts from the position noted in the previous step
> > > - It exits without replaying the WAL for (1,4)
> > >
> > > So, on the subscriber-side, you will see 3 records. Fourth is missing.
> > > Now, if you insert more records on the publisher, it will anyway
> > > replay those but the fourth one got missing.
> > >
> ...
> > >
> > > At this point, I can't think of any way to fix this problem except for
> > > going back to the previous approach of permanent slots but let me know
> > > if you have any ideas to salvage this approach?
> > >
> >
> > OK. The latest patch [v21] now restores the permanent slot (and slot
> > cleanup) approach as it was implemented in an earlier version [v17].
> > Please note that this change also re-introduces some potential slot
> > cleanup problems for some race scenarios.
> >
>
> I am able to reproduce the race condition where slot/origin will
> remain on the publisher node even when the corresponding subscription
> is dropped. Basically, if we error out in the 'catchup' phase in the
> tablesync worker then either it will restart and clean up the slot/origin,
> or, if in the meantime we have dropped the subscription and stopped the
> apply worker, the slot and origin will probably be left dangling on the
> publisher.
>
> I have used exactly the same test procedure as was used to expose the
> problem in the temporary slots with some minor changes as mentioned
> below:
> Subscriber-side
> 
> - Have a while(1) loop in LogicalRepSyncTableStart so that tablesync
> worker stops.
> - Have a while(1) loop in wait_for_relation_state_change so that we
> can control apply worker via debugger at the right time.
>
> Subscriber-side
> 
> - Have a breakpoint in apply_dispatch
> - continue in debugger;
> - After we replay first commit somehow error out. I have forced the
> debugger to set to the last line in apply_dispatch where the error is
> raised.
> - Now, the table sync worker won't restart because the apply worker is
> looping in wait_for_relation_state_change.
> - Execute DropSubscription;
> - We can allow apply worker to continue by skipping the while(1) and
> it will exit because DropSubscription would have sent a terminate
> signal.
>
> After the above steps, check the publisher (select * from
> pg_replication_slots) and you will find the dangling tablesync slot.
>
> I think to solve the above problem we should drop tablesync
> slot/origin at the Drop/Alter Subscription time and additionally we
> need to ensure that apply worker doesn't let tablesync workers restart
> (or it must not do any work to access the slot because the slots are
> dropped) once we 

Re: Single transaction in the tablesync worker?

2021-01-29 Thread Peter Smith
Hi Amit.

PSA the v22 patch for the Tablesync Solution1.

Differences from v21:
+ Patch is rebased to latest OSS HEAD @ 29/Jan.
+ Includes new code as suggested [ak0128] to ensure no dangling slots
at Drop/AlterSubscription.
+ Removes the slot/origin cleanup down by process interrupt logic
(cleanup_at_shutdown function).
+ Addresses some minor review comments.


[ak0128] 
https://www.postgresql.org/message-id/CAA4eK1LMYXZY1SpzgW-WyFdy%2BFTMZ4BMz1dj0rT2rxGv-zLwFA%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia


v22-0001-Tablesync-Solution1.patch
Description: Binary data


v22-0002-Tablesync-extra-logging.patch
Description: Binary data


Re: Single transaction in the tablesync worker?

2021-01-28 Thread Amit Kapila
On Thu, Jan 28, 2021 at 12:32 PM Peter Smith  wrote:
>
> On Wed, Jan 27, 2021 at 2:53 PM Amit Kapila  wrote:
> >
> > On Sat, Jan 23, 2021 at 5:56 PM Amit Kapila  wrote:
> > >
> > > On Sat, Jan 23, 2021 at 4:55 AM Peter Smith  wrote:
> > > >
> > > > PSA the v18 patch for the Tablesync Solution1.
> > >
> > > 7. Have you tested with the new patch the scenario where we crash
> > > after FINISHEDCOPY and before SYNCDONE, is it able to pick up the
> > > replication using the new temporary slot? Here, we need to test the
> > > case where during the catchup phase we have received few commits and
> > > then the tablesync worker is crashed/errored out? Basically, check if
> > > the replication is continued from the same point?
> > >
> >
> > I have tested this and it didn't work, see the below example.
> >
> > Publisher-side
> > 
> > CREATE TABLE mytbl1(id SERIAL PRIMARY KEY, somedata int, text varchar(120));
> >
> > BEGIN;
> > INSERT INTO mytbl1(somedata, text) VALUES (1, 1);
> > INSERT INTO mytbl1(somedata, text) VALUES (1, 2);
> > COMMIT;
> >
> > CREATE PUBLICATION mypublication FOR TABLE mytbl1;
> >
> > Subscriber-side
> > 
> > - Have a while(1) loop in LogicalRepSyncTableStart so that tablesync
> > worker stops.
> >
> > CREATE TABLE mytbl1(id SERIAL PRIMARY KEY, somedata int, text varchar(120));
> >
> >
> > CREATE SUBSCRIPTION mysub
> >  CONNECTION 'host=localhost port=5432 dbname=postgres'
> > PUBLICATION mypublication;
> >
> > During debug, stop after we mark FINISHEDCOPY state.
> >
> > Publisher-side
> > 
> > INSERT INTO mytbl1(somedata, text) VALUES (1, 3);
> > INSERT INTO mytbl1(somedata, text) VALUES (1, 4);
> >
> >
> > Subscriber-side
> > 
> > - Have a breakpoint in apply_dispatch
> > - continue in debugger;
> > - After we replay the first commit (which will be for values (1,3)), note
> > down the origin position in apply_handle_commit_internal and somehow
> > error out. I have forced the debugger to set to the last line in
> > apply_dispatch where the error is raised.
> > - After the error, again the tablesync worker is restarted and it
> > starts from the position noted in the previous step
> > - It exits without replaying the WAL for (1,4)
> >
> > So, on the subscriber side, you will see 3 records; the fourth is missing.
> > Now, if you insert more records on the publisher, it will replay
> > those anyway, but the fourth one stays missing.
> >
...
> >
> > At this point, I can't think of any way to fix this problem except for
> > going back to the previous approach of permanent slots but let me know
> > if you have any ideas to salvage this approach?
> >
>
> OK. The latest patch [v21] now restores the permanent slot (and slot
> cleanup) approach as it was implemented in an earlier version [v17].
> Please note that this change also re-introduces some potential slot
> cleanup problems for some race scenarios.
>

I am able to reproduce the race condition where slot/origin will
remain on the publisher node even when the corresponding subscription
is dropped. Basically, if we error out in the 'catchup' phase in the
tablesync worker, then either it will restart and clean up the slot/origin,
or, if in the meantime we have dropped the subscription and stopped the
apply worker, then probably the slot and origin will be left dangling on
the publisher.

I have used exactly the same test procedure as was used to expose the
problem in the temporary slots with some minor changes as mentioned
below:
Subscriber-side

- Have a while(1) loop in LogicalRepSyncTableStart so that tablesync
worker stops.
- Have a while(1) loop in wait_for_relation_state_change so that we
can control apply worker via debugger at the right time.

Subscriber-side

- Have a breakpoint in apply_dispatch
- continue in debugger;
- After we replay the first commit, somehow error out. I have forced the
debugger to set to the last line in apply_dispatch where the error is
raised.
- Now, the table sync worker won't restart because the apply worker is
looping in wait_for_relation_state_change.
- Execute DropSubscription;
- We can allow apply worker to continue by skipping the while(1) and
it will exit because DropSubscription would have sent a terminate
signal.

After the above steps, check the publisher (select * from
pg_replication_slots) and you will find the dangling tablesync slot.
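
For example, on the publisher a simple check such as the following should
still show the tablesync slot (inactive, since its worker is gone) even
though the subscription no longer exists on the subscriber:

SELECT slot_name, slot_type, temporary, active
  FROM pg_replication_slots;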

I think to solve the above problem we should drop tablesync
slot/origin at the Drop/Alter Subscription time and additionally we
need to ensure that apply worker doesn't let tablesync workers restart
(or it must not do any work to access the slot because the slots are
dropped) once we stopped them. To ensure that, I think we need to make
the following changes:

1. Take AccessExclusivelock on subscription_rel during Alter (before
calling RemoveSubscriptionRel) and don't release it till transaction
end (do table_close with NoLock) similar to DropSubscription.
2. Take share lock (AccessShareLock) in 

Re: Single transaction in the tablesync worker?

2021-01-27 Thread Peter Smith
On Wed, Jan 27, 2021 at 2:53 PM Amit Kapila  wrote:
>
> On Sat, Jan 23, 2021 at 5:56 PM Amit Kapila  wrote:
> >
> > On Sat, Jan 23, 2021 at 4:55 AM Peter Smith  wrote:
> > >
> > > PSA the v18 patch for the Tablesync Solution1.
> >
> > 7. Have you tested with the new patch the scenario where we crash
> > after FINISHEDCOPY and before SYNCDONE, is it able to pick up the
> > replication using the new temporary slot? Here, we need to test the
> > case where during the catchup phase we have received a few commits and
> > then the tablesync worker is crashed/errored out? Basically, check if
> > the replication is continued from the same point?
> >
>
> I have tested this and it didn't work, see the below example.
>
> Publisher-side
> 
> CREATE TABLE mytbl1(id SERIAL PRIMARY KEY, somedata int, text varchar(120));
>
> BEGIN;
> INSERT INTO mytbl1(somedata, text) VALUES (1, 1);
> INSERT INTO mytbl1(somedata, text) VALUES (1, 2);
> COMMIT;
>
> CREATE PUBLICATION mypublication FOR TABLE mytbl1;
>
> Subscriber-side
> 
> - Have a while(1) loop in LogicalRepSyncTableStart so that tablesync
> worker stops.
>
> CREATE TABLE mytbl1(id SERIAL PRIMARY KEY, somedata int, text varchar(120));
>
>
> CREATE SUBSCRIPTION mysub
>  CONNECTION 'host=localhost port=5432 dbname=postgres'
> PUBLICATION mypublication;
>
> During debug, stop after we mark FINISHEDCOPY state.
>
> Publisher-side
> 
> INSERT INTO mytbl1(somedata, text) VALUES (1, 3);
> INSERT INTO mytbl1(somedata, text) VALUES (1, 4);
>
>
> Subscriber-side
> 
> - Have a breakpoint in apply_dispatch
> - continue in debugger;
> - After we replay the first commit (which will be for values (1,3)), note
> down the origin position in apply_handle_commit_internal and somehow
> error out. I have forced the debugger to set to the last line in
> apply_dispatch where the error is raised.
> - After the error, again the tablesync worker is restarted and it
> starts from the position noted in the previous step
> - It exits without replaying the WAL for (1,4)
>
> So, on the subscriber side, you will see 3 records; the fourth is missing.
> Now, if you insert more records on the publisher, it will replay
> those anyway, but the fourth one stays missing.
>
> The temporary slots didn't seem to work because we created the new
> temporary slot again after the crash and asked it to start decoding from
> the point we noted in origin_lsn. The publisher didn't hold the
> required WAL as our slot was temporary, so it started sending from some
> later point. We retain WAL based on the slot's restart_lsn position and
> wal_keep_size. For our case, the positions of the slots will matter,
> and as we have created temporary slots, there is no way for the
> publisher to save that WAL.
>
> In this particular case, even if the WAL had still been there, we only
> pass the start_decoding_at position and don't pass restart_lsn, so it
> picked a random location (the current insert position in WAL) which is
> ahead of the start_decoding_at point, so it never sent the required
> fourth record. Now, I don't think it will work even if we somehow sent
> the correct restart_lsn, because, as I wrote earlier, there is no
> guarantee that the earlier WAL would have been saved.
>
> At this point, I can't think of any way to fix this problem except for
> going back to the previous approach of permanent slots but let me know
> if you have any ideas to salvage this approach?
>

OK. The latest patch [v21] now restores the permanent slot (and slot
cleanup) approach as it was implemented in an earlier version [v17].
Please note that this change also re-introduces some potential slot
cleanup problems for some race scenarios. These will be addressed by
future patches.


[v17] 
https://www.postgresql.org/message-id/CAHut%2BPt9%2Bg8qQR0kMC85nY-O4uDQxXboamZAYhHbvkebzC9fAQ%40mail.gmail.com
[v21] 
https://www.postgresql.org/message-id/CAHut%2BPvzHRRA_5O0R8KZCb1tVe1mBVPxFtmttXJnmuOmAegoWA%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Single transaction in the tablesync worker?

2021-01-27 Thread Peter Smith
Hi Amit.

PSA the v21 patch for the Tablesync Solution1.

Main differences from v20:
+ Rebased to latest OSS HEAD @ 27/Jan
+ v21 is a merging of patches [v17] and [v20], which was made
necessary when it was found [ak0127] that the v20 usage of TEMPORARY
tablesync slots did not work correctly. v21 reverts to using PERMANENT
tablesync slots same as implemented in v17, while retaining other
review comment fixes made for v18, v19, v20.


[v17] 
https://www.postgresql.org/message-id/CAHut%2BPt9%2Bg8qQR0kMC85nY-O4uDQxXboamZAYhHbvkebzC9fAQ%40mail.gmail.com
[v20] 
https://www.postgresql.org/message-id/CAHut%2BPuNwSujoL_dwa%3DTtozJ_vF%3DCnJxjgQTCmNBkazd8J1m-A%40mail.gmail.com
[ak0127] 
https://www.postgresql.org/message-id/CAA4eK1LDsj9kw4FbWAw3CMHyVsjafgDum03cYy-wpGmor%3D8-1w%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia


v21-0002-Tablesync-extra-logging.patch
Description: Binary data


v21-0001-Tablesync-Solution1.patch
Description: Binary data


Re: Single transaction in the tablesync worker?

2021-01-26 Thread Amit Kapila
On Sat, Jan 23, 2021 at 5:56 PM Amit Kapila  wrote:
>
> On Sat, Jan 23, 2021 at 4:55 AM Peter Smith  wrote:
> >
> > PSA the v18 patch for the Tablesync Solution1.
>
> 7. Have you tested with the new patch the scenario where we crash
> after FINISHEDCOPY and before SYNCDONE, is it able to pick up the
> replication using the new temporary slot? Here, we need to test the
> case where during the catchup phase we have received a few commits and
> then the tablesync worker is crashed/errored out? Basically, check if
> the replication is continued from the same point?
>

I have tested this and it didn't work, see the below example.

Publisher-side

CREATE TABLE mytbl1(id SERIAL PRIMARY KEY, somedata int, text varchar(120));

BEGIN;
INSERT INTO mytbl1(somedata, text) VALUES (1, 1);
INSERT INTO mytbl1(somedata, text) VALUES (1, 2);
COMMIT;

CREATE PUBLICATION mypublication FOR TABLE mytbl1;

Subscriber-side

- Have a while(1) loop in LogicalRepSyncTableStart so that tablesync
worker stops.

CREATE TABLE mytbl1(id SERIAL PRIMARY KEY, somedata int, text varchar(120));


CREATE SUBSCRIPTION mysub
 CONNECTION 'host=localhost port=5432 dbname=postgres'
PUBLICATION mypublication;

During debug, stop after we mark FINISHEDCOPY state.

Publisher-side

INSERT INTO mytbl1(somedata, text) VALUES (1, 3);
INSERT INTO mytbl1(somedata, text) VALUES (1, 4);


Subscriber-side

- Have a breakpoint in apply_dispatch
- continue in debugger;
- After we replay the first commit (which will be for values (1,3)), note
down the origin position in apply_handle_commit_internal and somehow
error out. I have forced the debugger to set to the last line in
apply_dispatch where the error is raised.
- After the error, again the tablesync worker is restarted and it
starts from the position noted in the previous step
- It exits without replaying the WAL for (1,4)

So, on the subscriber side, you will see 3 records; the fourth is missing.
Now, if you insert more records on the publisher, it will replay
those anyway, but the fourth one stays missing.
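
On the subscriber this is easy to see directly, e.g.:

SELECT * FROM mytbl1 ORDER BY id;

returns only the rows for (1,1), (1,2) and (1,3); (1,4) never arrives.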

The temporary slots didn't seem to work because we created the new
temporary slot again after the crash and asked it to start decoding from
the point we noted in origin_lsn. The publisher didn't hold the
required WAL as our slot was temporary, so it started sending from some
later point. We retain WAL based on the slot's restart_lsn position and
wal_keep_size. For our case, the positions of the slots will matter,
and as we have created temporary slots, there is no way for the
publisher to save that WAL.

In this particular case, even if the WAL had still been there, we only
pass the start_decoding_at position and don't pass restart_lsn, so it
picked a random location (the current insert position in WAL) which is
ahead of the start_decoding_at point, so it never sent the required
fourth record. Now, I don't think it will work even if we somehow sent
the correct restart_lsn, because, as I wrote earlier, there is no
guarantee that the earlier WAL would have been saved.
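
For reference, the slot positions involved can be inspected on the
publisher with something like:

SELECT slot_name, temporary, restart_lsn, confirmed_flush_lsn
  FROM pg_replication_slots;

restart_lsn is what bounds how much WAL the publisher keeps for a slot,
and a temporary slot goes away as soon as its session does, after which
nothing retains that WAL.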

At this point, I can't think of any way to fix this problem except for
going back to the previous approach of permanent slots but let me know
if you have any ideas to salvage this approach?

-- 
With Regards,
Amit Kapila.




Re: Single transaction in the tablesync worker?

2021-01-26 Thread Peter Smith
On Mon, Jan 25, 2021 at 4:48 PM Amit Kapila  wrote:
>
> On Mon, Jan 25, 2021 at 8:03 AM Peter Smith  wrote:
> >
> > Hi Amit.
> >
> > PSA the v19 patch for the Tablesync Solution1.
> >
>
> I see one race condition in this patch where we try to drop the origin
> via apply process and DropSubscription. I think it can lead to the
> error "cache lookup failed for replication origin with oid %u". The
> same problem can happen via exposed API pg_replication_origin_drop but
> probably because this is not used concurrently so nobody faced this
> issue. I think for the matter of this patch we can try to suppress
> such an error either via try..catch, or by adding missing_ok argument
> to replorigin_drop API, or we can just add to comments that such a
> race exists. Additionally, we should try to start a new thread for the
> existence of this problem in pg_replication_origin_drop. What do you
> think?

OK. A new thread [ps0127] for this problem was started

---
[ps0127] = 
https://www.postgresql.org/message-id/CAHut%2BPuW8DWV5fskkMWWMqzt-x7RPcNQOtJQBp6SdwyRghCk7A%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Single transaction in the tablesync worker?

2021-01-25 Thread Peter Smith
On Mon, Jan 25, 2021 at 4:48 PM Amit Kapila  wrote:
>
> On Mon, Jan 25, 2021 at 8:03 AM Peter Smith  wrote:
> >
> > Hi Amit.
> >
> > PSA the v19 patch for the Tablesync Solution1.
> >
>
> I see one race condition in this patch where we try to drop the origin
> via apply process and DropSubscription. I think it can lead to the
> error "cache lookup failed for replication origin with oid %u". The
> same problem can happen via exposed API pg_replication_origin_drop but
> probably because this is not used concurrently so nobody faced this
> issue. I think for the matter of this patch we can try to suppress
> such an error either via try..catch, or by adding missing_ok argument
> to replorigin_drop API, or we can just add to comments that such a
> race exists.

OK. This has been isolated to a common function called from 3 places.
The potential race ERROR is suppressed by TRY/CATCH.
Please see code of latest patch [v20]

> Additionally, we should try to start a new thread for the
> existence of this problem in pg_replication_origin_drop. What do you
> think?

OK. It is on my TODO list..


[v20] 
https://www.postgresql.org/message-id/CAHut%2BPuNwSujoL_dwa%3DTtozJ_vF%3DCnJxjgQTCmNBkazd8J1m-A%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Single transaction in the tablesync worker?

2021-01-25 Thread Peter Smith
On Mon, Jan 25, 2021 at 2:54 PM Amit Kapila  wrote:
>
> On Mon, Jan 25, 2021 at 8:23 AM Peter Smith  wrote:
> >
> > On Sat, Jan 23, 2021 at 11:26 PM Amit Kapila  
> > wrote:
> > >
> > > 2.
> > > @@ -98,11 +102,16 @@
> > >  #include "miscadmin.h"
> > >  #include "parser/parse_relation.h"
> > >  #include "pgstat.h"
> > > +#include "postmaster/interrupt.h"
> > >  #include "replication/logicallauncher.h"
> > >  #include "replication/logicalrelation.h"
> > > +#include "replication/logicalworker.h"
> > >  #include "replication/walreceiver.h"
> > >  #include "replication/worker_internal.h"
> > > +#include "replication/slot.h"
> > >
> > > I don't think the above includes are required. They seem to be a
> > > remnant of the previous approach.
> > >
> >
> > OK. Fixed in the latest patch [v19].
> >
>
> You seem to have forgotten to remove #include "replication/slot.h". Check
> whether it is required; if not, remove that as well.
>

Fixed in the latest patch [v20].


[v20] 
https://www.postgresql.org/message-id/CAHut%2BPuNwSujoL_dwa%3DTtozJ_vF%3DCnJxjgQTCmNBkazd8J1m-A%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Single transaction in the tablesync worker?

2021-01-25 Thread Peter Smith
On Mon, Jan 25, 2021 at 1:58 PM Amit Kapila  wrote:
>
> On Sun, Jan 24, 2021 at 12:24 PM Peter Smith  wrote:
> >
> > On Sat, Jan 23, 2021 at 11:26 PM Amit Kapila  
> > wrote:
> > >
> > >
> > > Few comments:
> > > =
> > > 1.
> > > - *   So the state progression is always: INIT -> DATASYNC -> SYNCWAIT ->
> > > - *   CATCHUP -> SYNCDONE -> READY.
> > > + *   So the state progression is always: INIT -> DATASYNC ->
> > > + *   (sync worker FINISHEDCOPY) -> SYNCWAIT -> CATCHUP -> SYNCDONE -> 
> > > READY.
> > >
> > > I don't think we need to be specific here that sync worker sets
> > > FINISHEDCOPY state.
> > >
> >
> > This was meant to indicate that *only* the sync worker knows about the
> > FINISHEDCOPY state, whereas all the other states are either known
> > (assigned and/or used) by *both* kinds of workers. But, I can remove
> > it if you feel that distinction is not useful.
> >
>
> Okay, but I feel you can mention that in the description you have
> added for FINISHEDCOPY state. It looks a bit odd here and the message
> you want to convey is also not that clear.
>

The comment is updated in the latest patch [v20].


[v20] 
https://www.postgresql.org/message-id/CAHut%2BPuNwSujoL_dwa%3DTtozJ_vF%3DCnJxjgQTCmNBkazd8J1m-A%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Single transaction in the tablesync worker?

2021-01-25 Thread Peter Smith
On Thu, Jan 21, 2021 at 9:17 PM Amit Kapila  wrote:
> 7.
> +# check for occurrence of the expected error
> +poll_output_until("replication slot \"$slotname\" already exists")
> +or die "no error stop for the pre-existing origin";
>
> In this test, isn't it better to check for datasync state like below?
> 004_sync.pl has some other similar test.
> my $started_query = "SELECT srsubstate = 'd' FROM pg_subscription_rel;";
> $node_subscriber->poll_query_until('postgres', $started_query)
>   or die "Timed out while waiting for subscriber to start sync";
>
> Is there a reason why we can't use the existing way to check for
> failure in this case?
>

The TAP test is updated in the latest patch [v20].


[v20] 
https://www.postgresql.org/message-id/CAHut%2BPuNwSujoL_dwa%3DTtozJ_vF%3DCnJxjgQTCmNBkazd8J1m-A%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Single transaction in the tablesync worker?

2021-01-25 Thread Peter Smith
Hi Amit.

PSA the v20 patch for the Tablesync Solution1.

Main differences from v19:
+ Updated TAP test [ak0123-7]
+ Fixed comment [ak0125-1]
+ Removed redundant header [ak0125-2]
+ Protection against race condition [ak0125-race]

[ak0123-7] 
https://www.postgresql.org/message-id/CAA4eK1JhpuwujrV6ABMmZ3jXfW37ssZnJ3fikrY7rRdvoEmu_g%40mail.gmail.com
[ak0125-1] 
https://www.postgresql.org/message-id/CAA4eK1JmP2VVpH2%3DO%3D5BBbuH7gyQtWn40aXp_Jyjn1%2BKggfq8A%40mail.gmail.com
[ak0125-2] 
https://www.postgresql.org/message-id/CAA4eK1L1j5sfBgHb0-H-%2B2quBstsA3hMcDfP-4vLuU-UF43nXQ%40mail.gmail.com
[ak0125-race] 
https://www.postgresql.org/message-id/CAA4eK1%2ByeLwBCkTvTdPM-hSk1fr6jT8KJc362CN8zrGztq_JqQ%40mail.gmail.com



Features:

* The tablesync worker is now allowing multiple tx instead of single tx.

* A new state (SUBREL_STATE_FINISHEDCOPY) is persisted after a
successful copy_table in tablesync's LogicalRepSyncTableStart.

* If a re-launched tablesync finds state SUBREL_STATE_FINISHEDCOPY
then it will bypass the initial copy_table phase.

* Now tablesync sets up replication origin tracking in
LogicalRepSyncTableStart (similar as done for the apply worker). The
origin is advanced when first created.

* The tablesync replication origin tracking record is cleaned up by:
- process_syncing_tables_for_apply
- DropSubscription
- AlterSubscription_refresh

* Updates to PG docs.

* New TAP test case.

Known Issues:

* Some records arriving between FINISHEDCOPY and SYNCDONE state may be
lost (currently under investigation).

---

Kind Regards,
Peter Smith.
Fujitsu Australia


v20-0002-Tablesync-extra-logging.patch
Description: Binary data


v20-0001-Tablesync-Solution1.patch
Description: Binary data


Re: Single transaction in the tablesync worker?

2021-01-24 Thread Amit Kapila
On Mon, Jan 25, 2021 at 8:03 AM Peter Smith  wrote:
>
> Hi Amit.
>
> PSA the v19 patch for the Tablesync Solution1.
>

I see one race condition in this patch where we try to drop the origin
via apply process and DropSubscription. I think it can lead to the
error "cache lookup failed for replication origin with oid %u". The
same problem can happen via exposed API pg_replication_origin_drop but
probably because this is not used concurrently so nobody faced this
issue. I think for the matter of this patch we can try to suppress
such an error either via try..catch, or by adding missing_ok argument
to replorigin_drop API, or we can just add to comments that such a
race exists. Additionally, we should try to start a new thread for the
existence of this problem in pg_replication_origin_drop. What do you
think?
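
For example, the same race can in principle be hit from SQL alone by
having two sessions drop the same origin at roughly the same time (the
origin name below is only an illustration):

SELECT pg_replication_origin_create('some_origin');

-- run concurrently in two sessions; the loser of the race can fail
-- with "cache lookup failed for replication origin with oid ..."
SELECT pg_replication_origin_drop('some_origin');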

-- 
With Regards,
Amit Kapila.




Re: Single transaction in the tablesync worker?

2021-01-24 Thread Amit Kapila
On Mon, Jan 25, 2021 at 8:23 AM Peter Smith  wrote:
>
> On Sat, Jan 23, 2021 at 11:26 PM Amit Kapila  wrote:
> >
> > 2.
> > @@ -98,11 +102,16 @@
> >  #include "miscadmin.h"
> >  #include "parser/parse_relation.h"
> >  #include "pgstat.h"
> > +#include "postmaster/interrupt.h"
> >  #include "replication/logicallauncher.h"
> >  #include "replication/logicalrelation.h"
> > +#include "replication/logicalworker.h"
> >  #include "replication/walreceiver.h"
> >  #include "replication/worker_internal.h"
> > +#include "replication/slot.h"
> >
> > I don't think the above includes are required. They seem to be a
> > remnant of the previous approach.
> >
>
> OK. Fixed in the latest patch [v19].
>

You seem to have forgotten to remove #include "replication/slot.h". Check
whether it is required; if not, remove that as well.

-- 
With Regards,
Amit Kapila.




Re: Single transaction in the tablesync worker?

2021-01-24 Thread Amit Kapila
On Sat, Jan 23, 2021 at 11:08 AM Ajin Cherian  wrote:
>
> On Sat, Jan 23, 2021 at 3:16 PM Amit Kapila  wrote:
>
> >
> > I think so. But do you have any reason to believe that it won't be
> > required anymore?
>
> A temporary slot will not clash with a permanent slot of the same name,
>

I have tried below and it seems to be clashing:
postgres=# SELECT 'init' FROM
pg_create_logical_replication_slot('test_slot2', 'test_decoding');
 ?column?
--
 init
(1 row)

postgres=# SELECT 'init' FROM
pg_create_logical_replication_slot('test_slot2', 'test_decoding',
true);
ERROR:  replication slot "test_slot2" already exists

Note that the third parameter in the second statement above indicates
whether it is a temporary slot or not. What am I missing?
-- 
With Regards,
Amit Kapila.




Re: Single transaction in the tablesync worker?

2021-01-24 Thread Amit Kapila
On Sun, Jan 24, 2021 at 12:24 PM Peter Smith  wrote:
>
> On Sat, Jan 23, 2021 at 11:26 PM Amit Kapila  wrote:
> >
> >
> > Few comments:
> > =
> > 1.
> > - *   So the state progression is always: INIT -> DATASYNC -> SYNCWAIT ->
> > - *   CATCHUP -> SYNCDONE -> READY.
> > + *   So the state progression is always: INIT -> DATASYNC ->
> > + *   (sync worker FINISHEDCOPY) -> SYNCWAIT -> CATCHUP -> SYNCDONE -> 
> > READY.
> >
> > I don't think we need to be specific here that sync worker sets
> > FINISHEDCOPY state.
> >
>
> This was meant to indicate that *only* the sync worker knows about the
> FINISHEDCOPY state, whereas all the other states are either known
> (assigned and/or used) by *both* kinds of workers. But, I can remove
> it if you feel that distinction is not useful.
>

Okay, but I feel you can mention that in the description you have
added for FINISHEDCOPY state. It looks a bit odd here and the message
you want to convey is also not that clear.

-- 
With Regards,
Amit Kapila.




Re: Single transaction in the tablesync worker?

2021-01-24 Thread Peter Smith
On Sat, Jan 23, 2021 at 11:26 PM Amit Kapila  wrote:
>
> 2.
> @@ -98,11 +102,16 @@
>  #include "miscadmin.h"
>  #include "parser/parse_relation.h"
>  #include "pgstat.h"
> +#include "postmaster/interrupt.h"
>  #include "replication/logicallauncher.h"
>  #include "replication/logicalrelation.h"
> +#include "replication/logicalworker.h"
>  #include "replication/walreceiver.h"
>  #include "replication/worker_internal.h"
> +#include "replication/slot.h"
>
> I don't think the above includes are required. They seem to be a
> remnant of the previous approach.
>

OK. Fixed in the latest patch [v19].

> 3.
>  process_syncing_tables_for_sync(XLogRecPtr current_lsn)
>  {
> - Assert(IsTransactionState());
> + bool sync_done = false;
>
>   SpinLockAcquire(&MyLogicalRepWorker->relmutex);
> + sync_done = MyLogicalRepWorker->relstate == SUBREL_STATE_CATCHUP &&
> + current_lsn >= MyLogicalRepWorker->relstate_lsn;
> + SpinLockRelease(&MyLogicalRepWorker->relmutex);
>
> - if (MyLogicalRepWorker->relstate == SUBREL_STATE_CATCHUP &&
> - current_lsn >= MyLogicalRepWorker->relstate_lsn)
> + if (sync_done)
>   {
>   TimeLineID tli;
>
> + /*
> + * Change state to SYNCDONE.
> + */
> + SpinLockAcquire(&MyLogicalRepWorker->relmutex);
>
> Why do we need these changes? If you have done it for the
> code-readability purpose then we can consider this as a separate patch
> because I don't see why these are required w.r.t this patch.
>

Yes it was for code readability in v17 when this function used to be
much larger. But it is not very necessary anymore and has been
reverted in the latest patch [v19].

> 4.
> - /*
> - * To build a slot name for the sync work, we are limited to NAMEDATALEN -
> - * 1 characters.  We cut the original slot name to NAMEDATALEN - 28 chars
> - * and append _%u_sync_%u (1 + 10 + 6 + 10 + '\0').  (It's actually the
> - * NAMEDATALEN on the remote that matters, but this scheme will also work
> - * reasonably if that is different.)
> - */
> - StaticAssertStmt(NAMEDATALEN >= 32, "NAMEDATALEN too small"); /* for sanity 
> */
> - slotname = psprintf("%.*s_%u_sync_%u",
> - NAMEDATALEN - 28,
> - MySubscription->slotname,
> - MySubscription->oid,
> - MyLogicalRepWorker->relid);
> + /* Calculate the name of the tablesync slot. */
> + slotname = ReplicationSlotNameForTablesync(
> +MySubscription->oid,
> +MyLogicalRepWorker->relid);
>
> What is the reason for changing the slot name calculation? If there is
> > any particular reason, then we can add a comment to indicate why we
> can't include the subscription's slotname in this calculation.
>

The tablesync slot name changes were not strictly necessary, so the
code is all reverted to be the same as OSS HEAD now in the latest
patch [v19].

> 5.
> This is WAL
> + * logged for for the purpose of recovery. Locks are to prevent the
> + * replication origin from vanishing while advancing.
>
> /for for/for
>

OK. Fixed in the latest patch [v19].

> 6.
> + /* Remove the tablesync's origin tracking if exists. */
> + snprintf(originname, sizeof(originname), "pg_%u_%u", subid, relid);
> + originid = replorigin_by_name(originname, true);
> + if (originid != InvalidRepOriginId)
> + {
> + elog(DEBUG1, "DropSubscription: dropping origin tracking for
> \"%s\"", originname);
>
> I don't think we need this and the DEBUG1 message in
> AlterSubscription_refresh. It is fine to print this information for
> background workers like in apply-worker, but not sure if we need it here.
> The DropSubscription drops the origin of apply worker but it doesn't
> use such a DEBUG message, so I guess we don't need it for tablesync origins
> as well.
>

OK. These DEBUG1 logs are removed in the latest patch [v19].


[v19] 
https://www.postgresql.org/message-id/CAHut%2BPsj7Xm8C1LbqeAbk-3duyS8xXJtL9TiGaeu3P8g272mAA%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Single transaction in the tablesync worker?

2021-01-24 Thread Amit Kapila
On Mon, Jan 25, 2021 at 6:15 AM Peter Smith  wrote:
>
> On Sun, Jan 24, 2021 at 5:54 PM Peter Smith  wrote:
> > > 4.
> > > - /*
> > > - * To build a slot name for the sync work, we are limited to NAMEDATALEN 
> > > -
> > > - * 1 characters.  We cut the original slot name to NAMEDATALEN - 28 chars
> > > - * and append _%u_sync_%u (1 + 10 + 6 + 10 + '\0').  (It's actually the
> > > - * NAMEDATALEN on the remote that matters, but this scheme will also work
> > > - * reasonably if that is different.)
> > > - */
> > > - StaticAssertStmt(NAMEDATALEN >= 32, "NAMEDATALEN too small"); /* for 
> > > sanity */
> > > - slotname = psprintf("%.*s_%u_sync_%u",
> > > - NAMEDATALEN - 28,
> > > - MySubscription->slotname,
> > > - MySubscription->oid,
> > > - MyLogicalRepWorker->relid);
> > > + /* Calculate the name of the tablesync slot. */
> > > + slotname = ReplicationSlotNameForTablesync(
> > > +MySubscription->oid,
> > > +MyLogicalRepWorker->relid);
> > >
> > > What is the reason for changing the slot name calculation? If there is
> > > any particular reason, then we can add a comment to indicate why we
> > > can't include the subscription's slotname in this calculation.
> > >
> >
> > The subscription slot name may be changed (e.g. ALTER SUBSCRIPTION)
> > and so including the subscription slot name as part of the tablesync
> > slot name was considered to be:
> > a) possibly risky/undefined, if the subscription slot_name = NONE
> > b) confusing, if we end up using 2 different slot names for the same
> > tablesync (e.g. if the subscription slot name is changed before a sync
> > worker is re-launched).
> > And since this subscription slot name part is not necessary for
> > uniqueness anyway, it was removed from the tablesync slot name to
> > eliminate those concerns.
> >
> > Also, the tablesync slot name calculation was encapsulated as a
> > separate function because previously (i.e. before v18) it was used by
> > various other cleanup codes. I still like it better as a function, but
> > now it is only called from one place so we could put that code back
> > inline if you prefer it how it was..
>
> It turns out those (a/b) concerns I wrote above may be unfounded,
> because it seems it is not possible to set slot_name = NONE unless the
> subscription is first DISABLED.
>

Yeah, but I think the user can still change to some other predefined
slot_name. However, I guess it doesn't matter unless it can lead to what
you have mentioned in (a). As that can't happen, it is probably better
to take out that change from the patch. I see your point of moving
this calculation to a separate function but not sure if it is worth it
unless we have to call it from multiple places or it simplifies the
existing code.

-- 
With Regards,
Amit Kapila.




Re: Single transaction in the tablesync worker?

2021-01-24 Thread Peter Smith
Hi Amit.

PSA the v19 patch for the Tablesync Solution1.

Main differences from v18:
+ Patch has been rebased off HEAD @ 24/Jan
+ Addressing some review comments [ak0123]

[ak0123] 
https://www.postgresql.org/message-id/CAA4eK1JhpuwujrV6ABMmZ3jXfW37ssZnJ3fikrY7rRdvoEmu_g%40mail.gmail.com



Features:

* The tablesync worker is now allowing multiple tx instead of single tx.

* A new state (SUBREL_STATE_FINISHEDCOPY) is persisted after a
successful copy_table in tablesync's LogicalRepSyncTableStart.

* If a re-launched tablesync finds state SUBREL_STATE_FINISHEDCOPY
then it will bypass the initial copy_table phase.

* Now tablesync sets up replication origin tracking in
LogicalRepSyncTableStart (similar as done for the apply worker). The
origin is advanced when first created.

* The tablesync replication origin tracking record is cleaned up by:
- process_syncing_tables_for_apply
- DropSubscription
- AlterSubscription_refresh

* Updates to PG docs.

* New TAP test case.

Known Issues:

* None.

---
Kind Regards,
Peter Smith.
Fujitsu Australia


v19-0001-Tablesync-Solution1.patch
Description: Binary data


v19-0002-Tablesync-extra-logging.patch
Description: Binary data


Re: Single transaction in the tablesync worker?

2021-01-24 Thread Peter Smith
On Sun, Jan 24, 2021 at 5:54 PM Peter Smith  wrote:
> > 4.
> > - /*
> > - * To build a slot name for the sync work, we are limited to NAMEDATALEN -
> > - * 1 characters.  We cut the original slot name to NAMEDATALEN - 28 chars
> > - * and append _%u_sync_%u (1 + 10 + 6 + 10 + '\0').  (It's actually the
> > - * NAMEDATALEN on the remote that matters, but this scheme will also work
> > - * reasonably if that is different.)
> > - */
> > - StaticAssertStmt(NAMEDATALEN >= 32, "NAMEDATALEN too small"); /* for 
> > sanity */
> > - slotname = psprintf("%.*s_%u_sync_%u",
> > - NAMEDATALEN - 28,
> > - MySubscription->slotname,
> > - MySubscription->oid,
> > - MyLogicalRepWorker->relid);
> > + /* Calculate the name of the tablesync slot. */
> > + slotname = ReplicationSlotNameForTablesync(
> > +MySubscription->oid,
> > +MyLogicalRepWorker->relid);
> >
> > What is the reason for changing the slot name calculation? If there is
> > any particular reason, then we can add a comment to indicate why we
> > can't include the subscription's slotname in this calculation.
> >
>
> The subscription slot name may be changed (e.g. ALTER SUBSCRIPTION)
> and so including the subscription slot name as part of the tablesync
> slot name was considered to be:
> a) possibly risky/undefined, if the subscription slot_name = NONE
> b) confusing, if we end up using 2 different slot names for the same
> tablesync (e.g. if the subscription slot name is changed before a sync
> worker is re-launched).
> And since this subscription slot name part is not necessary for
> uniqueness anyway, it was removed from the tablesync slot name to
> eliminate those concerns.
>
> Also, the tablesync slot name calculation was encapsulated as a
> separate function because previously (i.e. before v18) it was used by
> various other cleanup codes. I still like it better as a function, but
> now it is only called from one place so we could put that code back
> inline if you prefer it how it was..

It turns out those (a/b) concerns I wrote above may be unfounded,
because it seems it is not possible to set slot_name = NONE unless the
subscription is first DISABLED.
So probably I can revert all this tablesync slot name calculation back
to how it originally was in the OSS HEAD if you want.
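
For example (using the mysub subscription from the earlier steps in this
thread), the expected sequence is:

ALTER SUBSCRIPTION mysub SET (slot_name = NONE);  -- rejected while enabled
ALTER SUBSCRIPTION mysub DISABLE;
ALTER SUBSCRIPTION mysub SET (slot_name = NONE);  -- accepted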


Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Single transaction in the tablesync worker?

2021-01-23 Thread Peter Smith
On Sat, Jan 23, 2021 at 11:26 PM Amit Kapila  wrote:
>
> On Sat, Jan 23, 2021 at 4:55 AM Peter Smith  wrote:
> >
> > PSA the v18 patch for the Tablesync Solution1.
> >
>
> Few comments:
> =
> 1.
> - *   So the state progression is always: INIT -> DATASYNC -> SYNCWAIT ->
> - *   CATCHUP -> SYNCDONE -> READY.
> + *   So the state progression is always: INIT -> DATASYNC ->
> + *   (sync worker FINISHEDCOPY) -> SYNCWAIT -> CATCHUP -> SYNCDONE -> READY.
>
> I don't think we need to be specific here that sync worker sets
> FINISHEDCOPY state.
>

This was meant to indicate that *only* the sync worker knows about the
FINISHEDCOPY state, whereas all the other states are either known
(assigned and/or used) by *both* kinds of workers. But, I can remove
it if you feel that distinction is not useful.

> 4.
> - /*
> - * To build a slot name for the sync work, we are limited to NAMEDATALEN -
> - * 1 characters.  We cut the original slot name to NAMEDATALEN - 28 chars
> - * and append _%u_sync_%u (1 + 10 + 6 + 10 + '\0').  (It's actually the
> - * NAMEDATALEN on the remote that matters, but this scheme will also work
> - * reasonably if that is different.)
> - */
> - StaticAssertStmt(NAMEDATALEN >= 32, "NAMEDATALEN too small"); /* for sanity 
> */
> - slotname = psprintf("%.*s_%u_sync_%u",
> - NAMEDATALEN - 28,
> - MySubscription->slotname,
> - MySubscription->oid,
> - MyLogicalRepWorker->relid);
> + /* Calculate the name of the tablesync slot. */
> + slotname = ReplicationSlotNameForTablesync(
> +MySubscription->oid,
> +MyLogicalRepWorker->relid);
>
> What is the reason for changing the slot name calculation? If there is
> any particular reason, then we can add a comment to indicate why we
> can't include the subscription's slotname in this calculation.
>

The subscription slot name may be changed (e.g. ALTER SUBSCRIPTION)
and so including the subscription slot name as part of the tablesync
slot name was considered to be:
a) possibly risky/undefined, if the subscription slot_name = NONE
b) confusing, if we end up using 2 different slot names for the same
tablesync (e.g. if the subscription slot name is changed before a sync
worker is re-launched).
And since this subscription slot name part is not necessary for
uniqueness anyway, it was removed from the tablesync slot name to
eliminate those concerns.

Also, the tablesync slot name calculation was encapsulated as a
separate function because previously (i.e. before v18) it was used by
various other cleanup codes. I still like it better as a function, but
now it is only called from one place so we could put that code back
inline if you prefer it how it was..


Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Single transaction in the tablesync worker?

2021-01-23 Thread Peter Smith
FYI - I have done some long-running testing using the current patch [v18].

1. The src/test/subscription TAP tests:
- Subscription TAP tests were executed in a loop X 150 iterations.
- Duration 5 hrs.
- All iterations report "Result: PASS"

2. The postgres "make check" tests:
- make check was executed in a loop X 150 iterations.
- Duration 2 hrs.
- All iterations report "All 202 tests passed"

---
[v18] 
https://www.postgresql.org/message-id/CAHut%2BPvm0R%3DMn_uVN_JhK0scE54V6%2BEDGHJg1WYJx0Q8HX_mkQ%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Single transaction in the tablesync worker?

2021-01-23 Thread Amit Kapila
On Sat, Jan 23, 2021 at 4:55 AM Peter Smith  wrote:
>
> PSA the v18 patch for the Tablesync Solution1.
>

Few comments:
=
1.
- *   So the state progression is always: INIT -> DATASYNC -> SYNCWAIT ->
- *   CATCHUP -> SYNCDONE -> READY.
+ *   So the state progression is always: INIT -> DATASYNC ->
+ *   (sync worker FINISHEDCOPY) -> SYNCWAIT -> CATCHUP -> SYNCDONE -> READY.

I don't think we need to be specific here that sync worker sets
FINISHEDCOPY state.

2.
@@ -98,11 +102,16 @@
 #include "miscadmin.h"
 #include "parser/parse_relation.h"
 #include "pgstat.h"
+#include "postmaster/interrupt.h"
 #include "replication/logicallauncher.h"
 #include "replication/logicalrelation.h"
+#include "replication/logicalworker.h"
 #include "replication/walreceiver.h"
 #include "replication/worker_internal.h"
+#include "replication/slot.h"

I don't think the above includes are required. They seem to be a
remnant of the previous approach.

3.
 process_syncing_tables_for_sync(XLogRecPtr current_lsn)
 {
- Assert(IsTransactionState());
+ bool sync_done = false;

  SpinLockAcquire(&MyLogicalRepWorker->relmutex);
+ sync_done = MyLogicalRepWorker->relstate == SUBREL_STATE_CATCHUP &&
+ current_lsn >= MyLogicalRepWorker->relstate_lsn;
+ SpinLockRelease(&MyLogicalRepWorker->relmutex);

- if (MyLogicalRepWorker->relstate == SUBREL_STATE_CATCHUP &&
- current_lsn >= MyLogicalRepWorker->relstate_lsn)
+ if (sync_done)
  {
  TimeLineID tli;

+ /*
+ * Change state to SYNCDONE.
+ */
+ SpinLockAcquire(&MyLogicalRepWorker->relmutex);

Why do we need these changes? If you have done it for the
code-readability purpose then we can consider this as a separate patch
because I don't see why these are required w.r.t this patch.

4.
- /*
- * To build a slot name for the sync work, we are limited to NAMEDATALEN -
- * 1 characters.  We cut the original slot name to NAMEDATALEN - 28 chars
- * and append _%u_sync_%u (1 + 10 + 6 + 10 + '\0').  (It's actually the
- * NAMEDATALEN on the remote that matters, but this scheme will also work
- * reasonably if that is different.)
- */
- StaticAssertStmt(NAMEDATALEN >= 32, "NAMEDATALEN too small"); /* for sanity */
- slotname = psprintf("%.*s_%u_sync_%u",
- NAMEDATALEN - 28,
- MySubscription->slotname,
- MySubscription->oid,
- MyLogicalRepWorker->relid);
+ /* Calculate the name of the tablesync slot. */
+ slotname = ReplicationSlotNameForTablesync(
+MySubscription->oid,
+MyLogicalRepWorker->relid);

What is the reason for changing the slot name calculation? If there is
any particular reason, then we can add a comment to indicate why we
can't include the subscription's slotname in this calculation.

5.
This is WAL
+ * logged for for the purpose of recovery. Locks are to prevent the
+ * replication origin from vanishing while advancing.

/for for/for

6.
+ /* Remove the tablesync's origin tracking if exists. */
+ snprintf(originname, sizeof(originname), "pg_%u_%u", subid, relid);
+ originid = replorigin_by_name(originname, true);
+ if (originid != InvalidRepOriginId)
+ {
+ elog(DEBUG1, "DropSubscription: dropping origin tracking for
\"%s\"", originname);

I don't think we need this and the DEBUG1 message in
AlterSubscription_refresh. It is fine to print this information for
background workers like in apply-worker, but not sure if we need it here.
The DropSubscription drops the origin of apply worker but it doesn't
use such a DEBUG message, so I guess we don't need it for tablesync origins
as well.

7. Have you tested with the new patch the scenario where we crash
after FINISHEDCOPY and before SYNCDONE, is it able to pick up the
replication using the new temporary slot? Here, we need to test the
case where during the catchup phase we have received a few commits and
then the tablesync worker is crashed/errored out? Basically, check if
the replication is continued from the same point? I understand that
this can be only tested by adding some logs and we might not be able
to write a test for it.
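
For manual testing, the per-table state can at least be watched on the
subscriber while stepping through this, e.g.:

SELECT srrelid::regclass AS rel, srsubstate, srsublsn
  FROM pg_subscription_rel;

With the patch, the new FINISHEDCOPY state should show up in srsubstate
between the copy and the SYNCDONE/READY transitions.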

-- 
With Regards,
Amit Kapila.




Re: Single transaction in the tablesync worker?

2021-01-22 Thread Ajin Cherian
On Sat, Jan 23, 2021 at 3:16 PM Amit Kapila  wrote:

>
> I think so. But do you have any reason to believe that it won't be
> required anymore?

A temporary slot will not clash with a permanent slot of the same name,

regards,
Ajin Cherian
Fujitsu



