Re: Tablesync early exit

2022-04-04 Thread Amit Kapila
On Tue, Apr 5, 2022 at 9:37 AM Peter Smith  wrote:
>
> On Sat, Apr 2, 2022 at 5:17 PM Amit Kapila  wrote:
> >
> > On Fri, Apr 1, 2022 at 1:52 PM Peter Smith  wrote:
> > >
> > > On Wed, Mar 16, 2022 at 4:07 PM Amit Kapila  
> > > wrote:
> > >
> > > I think the STATE_CATCHUP guarantees the apply worker must have
> > > received (or tried to receive) a message. See the previous answer.
> > >
> >
> > Sorry, I intend to say till the sync worker has received any message.
> > The point is that LSN till where the copy has finished might actually
> > be later than some of the in-progress transactions on the server. It
> > may not be a good idea to blindly skip those changes if the apply
> > worker has already received those changes (say via a 'streaming'
> > mode). Today, all such changes would be written to the file and
> > applied at commit time but tomorrow, we can have an implementation
> > where we can apply such changes (via some background worker) by
> > skipping changes related to the table for which the table-sync worker
> > is in-progress. Now, in such a scenario, unless, we allow the table
> > sync worker to process more messages, we will end up losing some
> > changes for that particular table.
> >
> > As per my understanding, this is safe as per the current code but it
> > can't be guaranteed for future implementations and the amount of extra
> > work is additional work to receive the messages for one transaction. I
> > still don't think that it is a good idea to pursue this patch.
>
> IIUC you are saying that my patch is good today, but it may cause
> problems in a hypothetical future if the rest of the replication logic
> is implemented differently.
>

The approach I have alluded to above is already proposed earlier on
-hackers [1] to make streaming transactions perform better. So, it is
not completely hypothetical.

[1] - 
https://www.postgresql.org/message-id/8eda5118-2dd0-79a1-4fe9-eec7e334de17%40postgrespro.ru

-- 
With Regards,
Amit Kapila.




Re: Tablesync early exit

2022-04-04 Thread Peter Smith
On Sat, Apr 2, 2022 at 5:17 PM Amit Kapila  wrote:
>
> On Fri, Apr 1, 2022 at 1:52 PM Peter Smith  wrote:
> >
> > On Wed, Mar 16, 2022 at 4:07 PM Amit Kapila  wrote:
> >
> > I think the STATE_CATCHUP guarantees the apply worker must have
> > received (or tried to receive) a message. See the previous answer.
> >
>
> Sorry, I intend to say till the sync worker has received any message.
> The point is that LSN till where the copy has finished might actually
> be later than some of the in-progress transactions on the server. It
> may not be a good idea to blindly skip those changes if the apply
> worker has already received those changes (say via a 'streaming'
> mode). Today, all such changes would be written to the file and
> applied at commit time but tomorrow, we can have an implementation
> where we can apply such changes (via some background worker) by
> skipping changes related to the table for which the table-sync worker
> is in-progress. Now, in such a scenario, unless, we allow the table
> sync worker to process more messages, we will end up losing some
> changes for that particular table.
>
> As per my understanding, this is safe as per the current code but it
> can't be guaranteed for future implementations and the amount of extra
> work is additional work to receive the messages for one transaction. I
> still don't think that it is a good idea to pursue this patch.

IIUC you are saying that my patch is good today, but it may cause
problems in a hypothetical future if the rest of the replication logic
is implemented differently.

Anyway, it seems there is no chance of this getting committed, so it
is time for me to stop flogging this dead horse.

I will remove this from the CF.

--
Kind Regards,
Peter Smith
Fujitsu Australia




Re: Tablesync early exit

2022-04-02 Thread Amit Kapila
On Fri, Apr 1, 2022 at 1:52 PM Peter Smith  wrote:
>
> On Wed, Mar 16, 2022 at 4:07 PM Amit Kapila  wrote:
>
> I think the STATE_CATCHUP guarantees the apply worker must have
> received (or tried to receive) a message. See the previous answer.
>

Sorry, I intend to say till the sync worker has received any message.
The point is that LSN till where the copy has finished might actually
be later than some of the in-progress transactions on the server. It
may not be a good idea to blindly skip those changes if the apply
worker has already received those changes (say via a 'streaming'
mode). Today, all such changes would be written to the file and
applied at commit time but tomorrow, we can have an implementation
where we can apply such changes (via some background worker) by
skipping changes related to the table for which the table-sync worker
is in-progress. Now, in such a scenario, unless, we allow the table
sync worker to process more messages, we will end up losing some
changes for that particular table.

As per my understanding, this is safe as per the current code but it
can't be guaranteed for future implementations and the amount of extra
work is additional work to receive the messages for one transaction. I
still don't think that it is a good idea to pursue this patch.

-- 
With Regards,
Amit Kapila.




Re: Tablesync early exit

2022-04-01 Thread Peter Smith
On Wed, Mar 16, 2022 at 4:07 PM Amit Kapila  wrote:
>
> On Mon, Aug 30, 2021 at 8:50 AM Peter Smith  wrote:
> >
> > Patch v2 is the same; it only needed re-basing to the latest HEAD.
> >
>
> Why do you think it is correct to exit before trying to receive any
> message?

I think the STATE_CATCHUP state guarantees the apply worker must have
received (or tried to receive) a message. See the next answer.

> How will we ensure whether the apply worker has processed any
> message?

All this patch code does is call process_syncing_tables, which
delegates to process_syncing_tables_for_sync (because the call is from
a tablesync worker). This function code can’t do anything unless the
tablesync worker is in STATE_CATCHUP state, and that cannot happen
unless it was explicitly set to that state by the apply worker.

On the other side of the coin, the apply worker can only set that
syncworker->relstate = SUBREL_STATE_CATCHUP from within function
process_syncing_tables_for_apply, and AFAIK that function is only
called when the apply worker has either handled a message, (or the
walrcv_receive in the  LogicalRepApplyLoop received nothing).

So I think the STATE_CATCHUP mechanism itself ensures the apply worker
*must* have already processed a message (or there was no message to
process).

> At the beginning of function LogicalRepApplyLoop(),
> last_received is the LSN where the copy has finished in the case of
> tablesync worker. I think we need to receive the message before trying
> to ensure whether we have synced with the apply worker or not.
>

I think the STATE_CATCHUP guarantees the apply worker must have
received (or tried to receive) a message. See the previous answer.

~~~

AFAIK this patch is OK, but since it is not particularly urgent I've
bumped this to the next CommitFest [1] instead of trying to jam it
into PG15 at the last minute.

BTW - There were some useful logfiles I captured a very long time ago
[2]. They show the behaviour without/with this patch.

--
[1] https://commitfest.postgresql.org/37/3062/
[2] 
https://www.postgresql.org/message-id/cahut+ptjk-qgd3r1a1_tr62cmiswcyphuv0plmva-+2s8r0...@mail.gmail.com

Kind Regards,
Peter Smith
Fujitsu Australia




Re: Tablesync early exit

2022-03-15 Thread Amit Kapila
On Wed, Mar 16, 2022 at 1:16 AM Greg Stark  wrote:
>
> This patch has been through five CFs without any review. It's a very
> short patch and I'm assuming the only issue is that it's not clear
> whether it's a good idea or not and there are few developers familiar
> with this area of code?
>
> Amit, it looks like you were the one who asked for it to be split off
> from the logical decoding of 2PC patch in [1]. Can you summarize what
> questions remain here? Should we just commit this or is there any
> issue that needs to be debated?
>

Looking closely at this, I am not sure whether this is a good idea or
not. Responded accordingly.

-- 
With Regards,
Amit Kapila.




Re: Tablesync early exit

2022-03-15 Thread Amit Kapila
On Mon, Aug 30, 2021 at 8:50 AM Peter Smith  wrote:
>
> Patch v2 is the same; it only needed re-basing to the latest HEAD.
>

Why do you think it is correct to exit before trying to receive any
message? How will we ensure whether the apply worker has processed any
message? At the beginning of function LogicalRepApplyLoop(),
last_received is the LSN where the copy has finished in the case of
tablesync worker. I think we need to receive the message before trying
to ensure whether we have synced with the apply worker or not.

-- 
With Regards,
Amit Kapila.




Re: Tablesync early exit

2022-03-15 Thread Greg Stark
This patch has been through five CFs without any review. It's a very
short patch and I'm assuming the only issue is that it's not clear
whether it's a good idea or not and there are few developers familiar
with this area of code?

Amit, it looks like you were the one who asked for it to be split off
from the logical decoding of 2PC patch in [1]. Can you summarize what
questions remain here? Should we just commit this or is there any
issue that needs to be debated?

[1] 
https://www.postgresql.org/message-id/CAA4eK1Jxu-3qxtkfA_dKoquQgGZVcB+k9_-yT5=9gdew84t...@mail.gmail.com


On Sat, 6 Mar 2021 at 20:56, Peter Smith  wrote:
>
> Hi hackers.
>
> I propose a small optimization can be added to the tablesync replication code.
>
> This proposal (and simple patch) was first discussed here [1].
>
> Basic idea is the tablesync could/should detect if there is anything
> to do *before* it enters the apply main loop. Calling
> process_sync_tables() before the apply main loop offers a quick way
> out, so the message handling will not be unnecessarily between
> workers. This will be a small optimization.
>
> But also, IMO this is a more natural separation of work. E.g tablesync
> worker will finish when the table is synced - not go one extra step...
>
> ~~
>
> This patch was already successfully used for several versions
> (v43-v50) of another 2PC patch [2], but it was eventually removed from
> there because, although it has its own independent value, it was not
> required for that patch series [3].
>
> 
> [1] 
> https://www.postgresql.org/message-id/CAHut%2BPtjk-Qgd3R1a1_tr62CmiswcYphuv0pLmVA-%2B2s8r0Bkw%40mail.gmail.com
> [2] 
> https://www.postgresql.org/message-id/flat/CAHut%2BPsd5nyg-HG6rGO2_5jzXuSA1Eq5%2BB5J2VJo0Q2QWi-1HQ%40mail.gmail.com#1c2683756b32e267d96b7177ba95
> [3] 
> https://www.postgresql.org/message-id/CAA4eK1Jxu-3qxtkfA_dKoquQgGZVcB%2Bk9_-yT5%3D9GDEW84TF%2BA%40mail.gmail.com
>
> Kind Regards,
> Peter Smith.
> Fujitsu Australia
>
>


--
greg




Re: Tablesync early exit

2021-08-29 Thread Peter Smith
Patch v2 is the same; it only needed re-basing to the latest HEAD.


Kind Regards,
Peter Smith.
Fujitsu Australia


v2-0001-Tablesync-early-exit.patch
Description: Binary data


Re: Tablesync early exit

2021-03-08 Thread Peter Smith
On Sun, Mar 7, 2021 at 1:33 PM Amit Kapila  wrote:
>
> On Sun, Mar 7, 2021 at 7:26 AM Peter Smith  wrote:
> >
> > Hi hackers.
> >
> > I propose a small optimization can be added to the tablesync replication 
> > code.
> >
> > This proposal (and simple patch) was first discussed here [1].
> >
>
> It might be better if you attach your proposed patch to this thread.

PSA.


Kind Regards,
Peter Smith.
Fujitsu Australia


v1-0001-Tablesync-early-exit.patch
Description: Binary data


Re: Tablesync early exit

2021-03-06 Thread Amit Kapila
On Sun, Mar 7, 2021 at 7:26 AM Peter Smith  wrote:
>
> Hi hackers.
>
> I propose a small optimization can be added to the tablesync replication code.
>
> This proposal (and simple patch) was first discussed here [1].
>

It might be better if you attach your proposed patch to this thread.

-- 
With Regards,
Amit Kapila.




Tablesync early exit

2021-03-06 Thread Peter Smith
Hi hackers.

I propose a small optimization can be added to the tablesync replication code.

This proposal (and simple patch) was first discussed here [1].

Basic idea is the tablesync could/should detect if there is anything
to do *before* it enters the apply main loop. Calling
process_sync_tables() before the apply main loop offers a quick way
out, so the message handling will not be unnecessarily between
workers. This will be a small optimization.

But also, IMO this is a more natural separation of work. E.g tablesync
worker will finish when the table is synced - not go one extra step...

~~

This patch was already successfully used for several versions
(v43-v50) of another 2PC patch [2], but it was eventually removed from
there because, although it has its own independent value, it was not
required for that patch series [3].


[1] 
https://www.postgresql.org/message-id/CAHut%2BPtjk-Qgd3R1a1_tr62CmiswcYphuv0pLmVA-%2B2s8r0Bkw%40mail.gmail.com
[2] 
https://www.postgresql.org/message-id/flat/CAHut%2BPsd5nyg-HG6rGO2_5jzXuSA1Eq5%2BB5J2VJo0Q2QWi-1HQ%40mail.gmail.com#1c2683756b32e267d96b7177ba95
[3] 
https://www.postgresql.org/message-id/CAA4eK1Jxu-3qxtkfA_dKoquQgGZVcB%2Bk9_-yT5%3D9GDEW84TF%2BA%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia