Re: Improve pg_sync_replication_slots() to wait for primary to advance

2025-08-05 Thread Ajin Cherian
rrent proposed change in low-level functions appears to be > difficult to maintain, especially the change proposed in > update_and_persist_local_synced_slot(). If we can find a better way to > achieve the same then we can consider the current approach as well. > Next patch, I'll work on addressing this comment. I'll need to restructure the code to make this happen. regards, Ajin Cherian Fujitsu Australia

Re: Improve pg_sync_replication_slots() to wait for primary to advance

2025-08-05 Thread Ajin Cherian
> invalidated on primary > I've changed it to LOG now. > > 6) > - * We do not drop the slot because the restart_lsn can be ahead of the > - * current location when recreating the slot in the next cycle. It may > - * take more time to create such a slot. Therefore, we keep this slot > - * and attempt the synchronization in the next cycle. > + * If we're in the slotsync worker, we retain the slot and retry in the > + * next cycle. The restart_lsn might advance by then, allowing the slot > + * to be created successfully later. > */ > > I like the previous command better as it was conveying the side-effect > of dropping the slot herer. Can we try to incorporate the previous > comment here and make it specific to slotsync workers? > Reverted to the previous version. Attaching patch v4 which addresses these comments. regards, Ajin Cherian Fujitsu Australia v4-0001-Improve-initial-slot-synchronization-in-pg_sync_r.patch Description: Binary data

Re: 024_add_drop_pub.pl might fail due to deadlock

2025-07-31 Thread Ajin Cherian
On Thu, Jul 31, 2025 at 7:07 PM vignesh C wrote: > > On Thu, 31 Jul 2025 at 08:23, Ajin Cherian wrote: > > > > On Wed, Jul 30, 2025 at 10:33 PM Hayato Kuroda (Fujitsu) > > wrote: > > > > > > Dear Ajin, > > > > > > > Attaching the up

Re: Improve pg_sync_replication_slots() to wait for primary to advance

2025-07-31 Thread Ajin Cherian
On Thu, Jul 17, 2025 at 2:04 PM shveta malik wrote: > > On Wed, Jul 16, 2025 at 3:47 PM Ajin Cherian wrote: > > > > > I am not able to apply the patch to the latest head or even to a week > > > back version. Can you please check and rebase? > > > &g

Re: 024_add_drop_pub.pl might fail due to deadlock

2025-07-30 Thread Ajin Cherian
one mail. > > Thanks for update, but the patch for PG18/HEAD seemed not to have Assert(). > Can you modify like others do? > Updated patch for PG18/HEAD. regards, Ajin Cherian Fujitsu Australia PG_18_HEAD_v7-0001-Fix-a-possible-deadlock-during-ALTER-SUBSCRIPTION.patch Description: Binary data

Re: 024_add_drop_pub.pl might fail due to deadlock

2025-07-30 Thread Ajin Cherian
ttached old version patch. > PSA the correct version. Attaching the updated patches with the changes you requested. I've also added the unchanged patches for PG_18 and HEAD (PG_18_HEAD-v6*), so that everything is together in one mail. regards, Ajin Cherian Fujitsu Australia PG_17-v7-0001-Fi

Re: 024_add_drop_pub.pl might fail due to deadlock

2025-07-29 Thread Ajin Cherian
5 and PG_18_HEAD-v6-0001-Fix-a-possible-deadlock-during-ALTER-SUBSCRIPTION is for both PG_18 and HEAD. regards, Ajin Cherian Fujitsu Australia PG_14_15-v6-0001-Fix-a-deadlock-during-ALTER-SUBSCRIPTION.-DROP-PU.patch Description: Binary data PG_16-v6-0001-Fix-a-deadlock-during-ALTER-SUBSCRIPT

Re: 024_add_drop_pub.pl might fail due to deadlock

2025-07-28 Thread Ajin Cherian
h for HEAD/PG15/PG14, what about PG18, 17, 16 and 13? > If not needed, why? Yes, patches are needed for all these versions except PG13 because replicating origin is not there in PG13. If this logic is fine, I will create patches for all these branches as well and send them tomorrow. regard

Re: 024_add_drop_pub.pl might fail due to deadlock

2025-07-28 Thread Ajin Cherian
dateSubscriptionRelState itself. I've tested this and this works fine. If this logic is acceptable to the reviewers I can update the other patches also in a similar way. regards, Ajin Cherian Fujitsu Australia PG_15-v5-0001-Fix-a-deadlock-during-ALTER-SUBSCRIPTION.-DROP-PU.patch Description: Binary data

Re: 024_add_drop_pub.pl might fail due to deadlock

2025-07-24 Thread Ajin Cherian
to confirm one point. For which branch should be backpatch? Initially > it was reported only on PG15 [1], but I found 021_alter_sub_pub could fail on > PG14. Yes, here's a patch for PG14 as well, based on REL_14_STABLE. regards, Ajin Cherian Fujitsu Australia PG_14-v4-0001-Fix-a-deadl

Re: 024_add_drop_pub.pl might fail due to deadlock

2025-07-23 Thread Ajin Cherian
On Wed, Jul 23, 2025 at 4:32 PM Amit Kapila wrote: > > On Mon, Jul 21, 2025 at 6:59 PM Ajin Cherian wrote: > > > > Yes, this is correct. I have also verified this in my testing that > > when there is a second subscription, a deadlock can still happen with > >

Re: 024_add_drop_pub.pl might fail due to deadlock

2025-07-22 Thread Ajin Cherian
e similar deadlock will not happen. But just to keep code consistent, I have made a similar fix. regards, Ajin Cherian Fujitsu Australia HEAD-v3-0001-Fix-a-possible-deadlock-during-ALTER-SUBSCRIPTION.patch Description: Binary data

Re: 024_add_drop_pub.pl might fail due to deadlock

2025-07-21 Thread Ajin Cherian
on SubscriptionRelRelationId is not taken before dropping the tracking origin. I've also modified the signature of UpdateSubscriptionRelState to take a bool "lock_needed" which if false, the SubscriptionRelationId is not locked. I've made a new version of the patch on PG_

Re: Improve pg_sync_replication_slots() to wait for primary to advance

2025-07-16 Thread Ajin Cherian
> I am not able to apply the patch to the latest head or even to a week > back version. Can you please check and rebase? > > thanks > Shveta Rebased. Regards, Ajin Cherian Fujitsu Australia. v2-0001-Improve-initial-slot-synchronization-in-pg_sync_r.patch Description: Binary data

Re: Improve pg_sync_replication_slots() to wait for primary to advance

2025-07-16 Thread Ajin Cherian
firmed_lsn) || > !TransactionIdIsValid(remote_slot->catalog_xmin)) && > remote_slot->invalidated == RS_INVAL_NONE) > pfree(remote_slot); > > > Due to the above check in synchronize_slots(), we will not reach > wait_for_primary_slot_catchup() when any of confirmed_lsn or > catalog_xmin is not initialized. > Yes, you are correct. I've removed all that logic. The modified patch (v2) is attached. regards, Ajin Cherian Fujitsu Australia v2-0001-Improve-initial-slot-synchronization-in-pg_sync_r.patch Description: Binary data

Re: 024_add_drop_pub.pl might fail due to deadlock

2025-07-15 Thread Ajin Cherian
before it > finishes executing, but if an error causes the tablesync worker to > fail just prior to > dropping the origin, the apply worker will later find the origin and drop it. Thanks for the test and confirming the fix. Fixed the comments. regards, Ajin Cherian Fujitsu Australia HEAD-v2-0001-Fix-a-possible-deadlock-during-ALTER-SUBSCRIPTION.patch Description: Binary data

Re: 024_add_drop_pub.pl might fail due to deadlock

2025-07-14 Thread Ajin Cherian
On Tue, Jul 8, 2025 at 8:41 PM Ajin Cherian wrote: > > Patch with fix attached. > I'll continue investigating whether this issue also affects HEAD. > While debugging if this problem can occur on HEAD, I found out that on head, it is mostly the tablesync worker that drops the o

Re: 024_add_drop_pub.pl might fail due to deadlock

2025-07-14 Thread Ajin Cherian
On Tue, Jul 8, 2025 at 8:41 PM Ajin Cherian wrote: > Proposed fix: > In process_syncing_tables_for_apply(), acquire an AccessExclusiveLock > on SubscriptionRelRelationId before acquiring the lock on > ReplicationOriginRelationId. > > Patch with fix attached. > I'll cont

Re: 024_add_drop_pub.pl might fail due to deadlock

2025-07-08 Thread Ajin Cherian
On Mon, Jul 7, 2025 at 8:15 PM Ajin Cherian wrote: > > On Sun, Jul 6, 2025 at 2:00 AM Alexander Lakhin wrote: > > > > --- a/src/backend/replication/logical/origin.c > > +++ b/src/backend/replication/logical/origin.c > > @@ -428,6 +428,7 @@ replorigin_drop_

Re: 024_add_drop_pub.pl might fail due to deadlock

2025-07-07 Thread Ajin Cherian
was added in v15). > > [1] > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=petalura&dt=2025-07-01%2018%3A00%3A58 > > Best regards, > Alexander > Hi Alexander, Yes, the problem can be reproduced by the changes you suggested. I will look into what is happening and how we can fix this. regards, Ajin Cherian Fujitsu Australia

Re: Restrict publishing of partitioned table with a foreign table as partition

2025-06-30 Thread Ajin Cherian
On Mon, Jun 9, 2025 at 3:58 PM Shlok Kyal wrote: > > On Wed, 4 Jun 2025 at 16:12, Ajin Cherian wrote: > > > > On Tue, May 20, 2025 at 2:33 AM Shlok Kyal wrote: > > > > > > This approach seems better to me. I have created a patch with the > > > a

Re: Historic snapshot doesn't track txns committed in BUILDING_SNAPSHOT state

2025-06-27 Thread Ajin Cherian
_snapshot_lsn' which we can use. If base_snapshot_lsn < builder->start_decoding_at, then we should rebuild the snapshot. Just a thought. regards, Ajin Cherian Fujitsu Australia [1] - https://www.postgresql.org/message-id/18509-983f064d174ea880%40postgresql.org

Re: Improve pg_sync_replication_slots() to wait for primary to advance

2025-06-25 Thread Ajin Cherian
Sorry, I attached the wrong file. Attaching the correct file now. regards, Ajin Cerian Fujitsu Australia 0001-Improve-initial-slot-synchronization-in-pg_sync_repl.patch Description: Binary data

Improve pg_sync_replication_slots() to wait for primary to advance

2025-06-24 Thread Ajin Cherian
up. We could later add a timeout parameter to control maximum wait time if this approach seems acceptable. If there are more ideas on improving this patch, let me know. regards, Ajin Cherian [1] - https://www.postgresql.org/message-id/CAF1DzPWTcg%2Bm%2Bx%2BoVVB%3Dy4q9%3DPYYsL_mujVp7uJr-_oUtWNGbA

Re: Fix slot synchronization with two_phase decoding enabled

2025-06-11 Thread Ajin Cherian
wophasestate of pg_subscription to know the actual internal state of the two-phase mode" (with appropriate tags and reference links) Patch 0002 1. Commit message Typo: change "subscrption" to "subscription" regards, Ajin Cherian Fujitsu Australia

Re: Proposal: Filter irrelevant change before reassemble transactions during logical decoding

2025-06-06 Thread Ajin Cherian
On Wed, Jun 4, 2025 at 8:22 PM Ajin Cherian wrote: > > On Tue, Jun 3, 2025 at 4:25 PM Ajin Cherian wrote: > > > > On Tue, Jun 3, 2025 at 3:55 PM Amit Kapila wrote: > > > > > > > > > You haven't shared the exact test scenario, but I am

Re: Restrict publishing of partitioned table with a foreign table as partition

2025-06-04 Thread Ajin Cherian
true and contains partitioned tables with foreign table as partition ", + list_length(publist), pubnames->data), I think you meant "publish_via_partition_root" here and not "publish_via_root_partition ". regards, Ajin Cherian Fujitsu Australia

Re: Proposal: Filter irrelevant change before reassemble transactions during logical decoding

2025-06-04 Thread Ajin Cherian
On Tue, Jun 3, 2025 at 4:25 PM Ajin Cherian wrote: > > On Tue, Jun 3, 2025 at 3:55 PM Amit Kapila wrote: > > > > > > You haven't shared the exact test scenario, but I am assuming the > > above tests are for very large transactions, as you are comparing > &g

Re: Proposal: Filter irrelevant change before reassemble transactions during logical decoding

2025-06-02 Thread Ajin Cherian
size (say one insert or one update, or one delete) as > well? > Attaching the scripts I used for my tests. Yes, I used transactions with large inserts. I will redo the tests with short single inserts and share the results here. regards, Ajin Cherian Fujitsu Australia <>

Re: Logical Replication of sequences

2025-05-28 Thread Ajin Cherian
_sequence_state() + + UnlockReleaseBuffer(buf); + relation_close(seqrel, NoLock); Ideally the corresponding close for init_sequence is sequence_close() rather than relation_close() I will post comments for the other patches as well. regards, Ajin Cherian Fujitsu Australia

Re: Proposal: Filter irrelevant change before reassemble transactions during logical decoding

2025-04-30 Thread Ajin Cherian
ring DELETE/, > > 'unpublished DELETE is filtered'); > > > > $log_location = -s $node_publisher->logfile; > > $node_publisher->safe_psql('postgres', "INSERT INTO delete_only_table > > VALUES (1, 'to be deleted')"); > >

Re: Proposal: Filter irrelevant change before reassemble transactions during logical decoding

2025-04-16 Thread Ajin Cherian
;. Am I missing > something here? > I have also added the same comment in point 1. in [2]. > Fixed. On Sat, Apr 5, 2025 at 12:34 AM Shlok Kyal wrote: > > On Thu, 20 Mar 2025 at 18:09, Ajin Cherian wrote: > > > > On Thu, Mar 13, 2025 at 5:49 PM Ajin Cherian wrote:

Re: Historic snapshot doesn't track txns committed in BUILDING_SNAPSHOT state

2025-03-28 Thread Ajin Cherian
re a bug on HEAD that this is solving? If so, can you write a test case to show this? This will make it more convincing that this change as well is required. Your other change has a test case to confirm, but this doesn't. regards, Ajin Cherian Fujitsu Australia

Re: Historic snapshot doesn't track txns committed in BUILDING_SNAPSHOT state

2025-03-26 Thread Ajin Cherian
#x27;t see an explanation as to why this change was added? Maybe I missed it in the multiple threads. With this, we are no longer decoding changes which were in the state SNAPBUILD_FULL_SNAPSHOT. Why did that change? regards, Ajin Cherian Fujitsu Australia

Re: Historic snapshot doesn't track txns committed in BUILDING_SNAPSHOT state

2025-03-26 Thread Ajin Cherian
ges(ctx->reorder, xid, buf->origptr); +} Any reason why you avoided calling SnapBuildProcessNewCid here and only called ReorderBufferXidSetCatalogChanges? If any, please mention in the comments the reason. regards, Ajin Cherian Fujitsu Australia

Re: Proposal: Filter irrelevant change before reassemble transactions during logical decoding

2025-03-20 Thread Ajin Cherian
On Thu, Mar 13, 2025 at 5:49 PM Ajin Cherian wrote: > > Moving this patch to the next CF as this patch needs more design level > inputs which may not be feasible in this CF but do continue to review > the patch. > > regards, > Ajin Cherian > Fujitsu Australia Rebased t

Re: Proposal: Filter irrelevant change before reassemble transactions during logical decoding

2025-03-12 Thread Ajin Cherian
Moving this patch to the next CF as this patch needs more design level inputs which may not be feasible in this CF but do continue to review the patch. regards, Ajin Cherian Fujitsu Australia

Re: Proposal: Filter irrelevant change before reassemble transactions during logical decoding

2025-03-12 Thread Ajin Cherian
On Mon, Mar 3, 2025 at 9:10 PM Ajin Cherian wrote: > > On Thu, Feb 20, 2025 at 8:42 PM Hayato Kuroda (Fujitsu) > wrote: > > I confirmed with tests that with the patch, a lot less disk space is used when there are lots of unpublished changes and the subscription is not configured

Re: Enhance 'pg_createsubscriber' to retrieve databases automatically when no database is provided.

2025-03-09 Thread Ajin Cherian
On Mon, Mar 10, 2025 at 3:58 PM Ashutosh Bapat wrote: > > On Thu, Mar 6, 2025 at 9:18 AM Ajin Cherian wrote: > > > + Subscription names, publication names, and replication slot names > > are > > + automatically generated. Cannot be used together

Re: Enhance 'pg_createsubscriber' to retrieve databases automatically when no database is provided.

2025-03-05 Thread Ajin Cherian
iptions (not template databases) change to "Verify that only user databases have subscriptions" regards, Ajin Cherian Fujitsu Australia

Re: Proposal: Filter irrelevant change before reassemble transactions during logical decoding

2025-03-03 Thread Ajin Cherian
erhead of performing a hash table > * search for each record, especially when most changes are not filterable. > */ > -#define CHANGES_THRESHOLD_FOR_FILTER 100 > +#define CHANGES_THRESHOLD_FOR_FILTER 0 > > Why is this defined as 0? Some accidental residue from performance &g

Re: Proposal: Filter irrelevant change before reassemble transactions during logical decoding

2025-02-24 Thread Ajin Cherian
On Tue, Feb 25, 2025 at 3:26 PM Ajin Cherian wrote: > > On Fri, Feb 21, 2025 at 2:24 PM Amit Kapila wrote: > > > > On Fri, Feb 21, 2025 at 7:57 AM Ajin Cherian wrote: > > > In these tests, I also see an increased performance with the patch > > > even when al

Re: Proposal: Filter irrelevant change before reassemble transactions during logical decoding

2025-02-24 Thread Ajin Cherian
On Fri, Feb 21, 2025 at 2:24 PM Amit Kapila wrote: > > On Fri, Feb 21, 2025 at 7:57 AM Ajin Cherian wrote: > > In these tests, I also see an increased performance with the patch > > even when all transactions are published. I will investigate why this > > happens and up

Re: Proposal: Filter irrelevant change before reassemble transactions during logical decoding

2025-02-20 Thread Ajin Cherian
On Fri, Feb 21, 2025 at 12:57 PM Ajin Cherian wrote: > > On Thu, Feb 20, 2025 at 3:08 PM Hayato Kuroda (Fujitsu) > wrote: > > > > Dear Ajin, > > > > > I compared the patch 1 which does not employ a hash cache and has the > > > overhead of starting a

Re: Proposal: Filter irrelevant change before reassemble transactions during logical decoding

2025-02-20 Thread Ajin Cherian
d all transactions on published tables Conclusion: The patched code with 100 transaction throttling significantly improves performance, reducing execution time by ~69% when no published transactions are involved, ~43% with partial published transactions, and ~15% in all published transactions. Attach

Re: Proposal: Filter irrelevant change before reassemble transactions during logical decoding

2025-02-19 Thread Ajin Cherian
On Thu, Feb 20, 2025 at 1:30 PM Ajin Cherian wrote: > > On Fri, Feb 14, 2025 at 6:18 PM Amit Kapila wrote: > > > > On Wed, Feb 12, 2025 at 10:41 AM Ajin Cherian wrote: > > > > > > On Wed, Jan 29, 2025 at 9:31 AM Peter Smith wrote: > > > > >

Re: Proposal: Filter irrelevant change before reassemble transactions during logical decoding

2025-02-17 Thread Ajin Cherian
ottling. I think we need a way for the user to set this threshold via a GUC and that can be used for testing. regards, Ajin Cherian Fujitsu Australia v14-0001-Filter-transactions-that-need-not-be-published.patch Description: Binary data v14-0002-Introduce-a-output-plugin-callback-to-filter-cha.patch Description: Binary data

Re: Proposal: Filter irrelevant change before reassemble transactions during logical decoding

2025-02-11 Thread Ajin Cherian
f concurrent aborts. 2. I think we can extend the skip mechanism to UPDATE/DELETE/MultiInsert/SpecConfirm. Regarding the TRUNCATE, I'm not sure we can handle hte TRUNCATE case because the we can't track RelFileLocator anymore. Updated. On Tue, Feb 4, 2025 at 2:19 PM vignesh C wrote:

Re: Proposal: Filter irrelevant change before reassemble transactions during logical decoding

2025-01-27 Thread Ajin Cherian
is is the second attempt) > > > Thanks Hou-san, Here's a patch-set created based on the design changes proposed by Hou-san. P.S: It has reached v12 after internal review/rework iterations Regards, Ajin Cherian Fujitsu Australia v12-0001-Track-transactions-with-internal-snap

Re: Adding a '--two-phase' option to 'pg_createsubscriber' utility.

2025-01-15 Thread Ajin Cherian
pared transactions will be replicated at COMMIT PREPARED."); } I think this code needs to be updated as well. regards, Ajin Cherian Fujitsu Australia

Re: Adding a '--two-phase' option to 'pg_createsubscriber' utility.

2025-01-12 Thread Ajin Cherian
On Fri, Jan 10, 2025 at 9:08 PM Ajin Cherian wrote: > > On Fri, Dec 27, 2024 at 5:36 PM Shubham Khanna > wrote: > > > The patch no longer applies on HEAD. Please do rebase. > Sorry, I was mistaken. Ignore this. The patch does apply on HEAD. regards, Ajin Cherian

Re: Adding a '--two-phase' option to 'pg_createsubscriber' utility.

2025-01-10 Thread Ajin Cherian
On Fri, Dec 27, 2024 at 5:36 PM Shubham Khanna wrote: > The patch no longer applies on HEAD. Please do rebase. regards, Ajin Cherian Fujitsu Australia

Re: Pgoutput not capturing the generated columns

2024-11-03 Thread Ajin Cherian
On Mon, Nov 4, 2024 at 2:20 PM Amit Kapila wrote: > > On Thu, Oct 31, 2024 at 4:26 PM Ajin Cherian wrote: > > > > 5. Verified that publications with different column list are disallowed to > > be subscribed by one subscription > >a. PUB_A(column list = (a,

Re: Pgoutput not capturing the generated columns

2024-10-31 Thread Ajin Cherian
On Thu, Oct 31, 2024 at 9:55 PM Ajin Cherian wrote: > I ran some tests and verified that the patch works with previous versions > of PG12 and PG17 > 1. Verified with publications with generated columns and without generated > columns on patched code and subscriptions on P

Re: Pgoutput not capturing the generated columns

2024-10-31 Thread Ajin Cherian
olumn list, without publish_generated_column) PUB_B(no column list, with publish_generated_column) - FAIL Tests did not show any unexpected behaviour. regards, Ajin Cherian Fujitsu Australia

Re: Conflict Detection and Resolution

2024-09-19 Thread Ajin Cherian
Can we change this to use the new foreach_ptr implementations added: > + foreach(lc, stmtresolvers) > + { > + DefElem*defel = (DefElem *) lfirst(lc); > + ConflictType type; > + char *resolver; > > to use foreach_ptr like: > foreach_ptr(DefElem, defel, stmtresolvers) > { > + ConflictType type; > + char *resolver; > > } > Changed accordingly. regards, Ajin Cherian Fujitsu Australia

Re: Conflict Detection and Resolution

2024-08-27 Thread Ajin Cherian
t; > I am not sure if we need to dump default resolvers. Would like to know > what others think on this. > > 3) > Why in 002_pg_dump.pl we have default resolvers set explicitly? > > In 003_pg_dump.pl, default resolvers are not set explicitly, that is the regexp to check the pg_dum

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-08-13 Thread Ajin Cherian
alculation of resource limits really required here when the same is already done inside InvalidateObsoleteReplicationSlots() regards, Ajin Cherian Fujitsu Australia

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-08-12 Thread Ajin Cherian
n. This has been the > approach until v40 patch. > b) Checkpointer and/or other backends add an autovacuum work item via > AutoVacuumRequestWork(), and autovacuum when it gets to it will > invalidate the replication slots. But, what to do for the vacuum > command here? > > Please find the attached v41 patches implementing the idea of vacuum > doing the invalidation. > > Thoughts? > > Thanks to Sawada-san for a detailed off-list discussion. > The patch no longer applies on HEAD, please rebase. regards, Ajin Cherian Fujitsu Australia

Re: Conflict Detection and Resolution

2024-07-01 Thread Ajin Cherian
1) 2024-07-01 02:52:59.427 EDT [20304] CONTEXT: processing remote data for replication origin "pg_16417" during message type "COMMIT" in transaction 763, finished at 0/15E7F68 2024-07-01 02:52:59.427 EDT [20304] WARNING: resource was not closed: TupleDesc 0x7f8c0439e448 (16402,-1)

Re: Proposal: Filter irrelevant change before reassemble transactions during logical decoding

2024-06-04 Thread Ajin Cherian
On Wed, May 22, 2024 at 2:17 PM Ajin Cherian wrote: > > The reason for the crash is that the RelationSyncCache was NULL prior to > reaching a consistent point. > Hi li jie, I see that you created a new thread with an updated version of > this patch [1]. I used that patch and addr

Re: Proposal: Filter irrelevant change before reassemble transactions during logical decoding

2024-05-21 Thread Ajin Cherian
On Tue, Mar 19, 2024 at 1:49 PM Ajin Cherian wrote: > > >> Of course you can, but this will only convert disk space into memory >> space. >> For details, please see the case in Email [1]. >> >> [1] >> https://www.postgresql.org/message-id/CAGfChW51P944

Re: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-04-22 Thread Ajin Cherian
On Thu, Apr 18, 2024 at 4:26 PM Ajin Cherian wrote: > > Attaching the patch for your review and comments. Big thanks to Kuroda-san > for also working on the patch. > > Looking at this a bit more, maybe rolling back all prepared transactions on the subscriber when toggling two_pha

Re: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-04-17 Thread Ajin Cherian
decoded as the pending prepared transactions are prior to the "two_phase_at" LSN. With this patch, now we are able to handle both pending prepared transactions when altering two_phase from true to false as well as false to true. Attaching the patch for your review and comments. Big thanks to K

Re: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-04-16 Thread Ajin Cherian
that me that the > problem takes place. > > > Yes, I was able to reproduce the slow catchup of twophase transactions with pgbench with 20 clients. regards, Ajin Cherian Fujitsu Australia

Re: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-04-05 Thread Ajin Cherian
r. 2. No solution yet. 3. We could mandate that the altering of two_phase state only be done after disabling the subscription, just like how it is handled for failover option. Let me know your thoughts. regards, Ajin Cherian Fujitsu Australia v2-0001-Allow-altering-of-two_phase-option-of-a-SUBSCRIPT.patch Description: Binary data

Re: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-04-03 Thread Ajin Cherian
Kuroda-san for working on the patch. regards, Ajin Cherian Fujitsu Australia v1-0001-Allow-altering-of-two_phase-option-in-subscribers.patch Description: Binary data

Re: Skip collecting decoded changes of already-aborted transactions

2024-03-27 Thread Ajin Cherian
herwise return false. we discards/we discard 2. In function ReorderBufferCheckTXNAbort(): I haven't tested this but I wonder how prepared transactions would be considered, they are neither committed, nor in progress. regards, Ajin Cherian Fujitsu Australia

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread Ajin Cherian
Lock held. Maybe do this prior to acquiring the spinlock. regards, Ajin Cherian Fujitsu Australia

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-22 Thread Ajin Cherian
ilover != failover) { MyReplicationSlot->data.failover = failover; } if (MyReplicationSlot->data.inactive_timeout != inactive_timeout) { MyReplicationSlot->data.inactive_timeout = inactive_timeout; } if (lock_acquired) { SpinLockRelease(&MyReplicationSlot->mutex); ReplicationSlotMarkDirty(); ReplicationSlotSave(); } 2. In patch 0005: why change walrcv_alter_slot option? it doesn't seem to be used anywhere, any use case for it? If required, would the intention be to add this as a Create Subscription option? regards, Ajin Cherian Fujitsu Australia

Re: Proposal: Filter irrelevant change before reassemble transactions during logical decoding

2024-03-18 Thread Ajin Cherian
he publication publishes "ALL TABLES"? 5. In function: pgoutput_table_filter() - this code appears to be filtering out not just unpublished tables but also applying row based filters on published tables as well. Is this really within the scope of the feature? regards, Ajin Cherian Fujitsu Australia

Re: Skip collecting decoded changes of already-aborted transactions

2024-03-14 Thread Ajin Cherian
check the + * transaction status so the caller always process this transaction. + */ + if (debug_logical_replication_streaming == DEBUG_LOGICAL_REP_STREAMING_IMMEDIATE) + return false; /process/processes regards, Ajin Cherian Fujitsu Australia

Re: Race condition in FetchTableStates() breaks synchronization of subscription tables

2024-03-11 Thread Ajin Cherian
, it doesn't fit well. regards, Ajin Cherian Fujitsu Australia

Re: Synchronizing slots from primary to standby

2024-03-07 Thread Ajin Cherian
Run2: 627.924183 secs There was no degradation in performance seen. Thanks Nisha for helping with the testing. regards, Ajin Cherian Fujitsu Australia

Re: Synchronizing slots from primary to standby

2024-03-01 Thread Ajin Cherian
dback=on, standby_slot_names not configured, logical subscriber not failover enabled, physical standby not configured for sync RUN1 (TPS) RUN2 (TPS) AVERAGE (TPS) 8.18839 8.18839 8.18839-- degradation from first config *-0.17%* PATCHED code - (v98-0001) Synchronous_commit=on, hot_standby_feedback=on, standby_slot_names configured to physical standby, logical subscriber failover enabled, physical standby configured for sync RUN1 (TPS) RUN2 (TPS) AVERAGE (TPS) 8.173062 8.068536 8.120799-- degradation from first config* -0.99%* Overall, I do not see any significant performance degradation with the patch and sync-slot enabled with one logical subscriber and one physical standby. Attaching script for my final test configuration for reference. regards, Ajin Cherian Fujitsu Australia <>

Re: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-02-22 Thread Ajin Cherian
ROM test WHERE v = pg_backend_pid(); INSERT INTO test(v) SELECT pg_backend_pid(); PREPARE TRANSACTION $$:mygid$$; COMMIT PREPARED $$:mygid$$; regards, Ajin Cherian Fujitsu Australia

Re: Have pg_basebackup write "dbname" in "primary_conninfo"?

2024-02-20 Thread Ajin Cherian
time the information whether it > is > intentionally specified or not is discarded. Then, > GenerateRecoveryConfig() would > just write down all the connection parameters from PGconn, they cannot > recognize. > > Well, one option is that if a default dbname should be written to the configuration file, then "postgres' is a better option than "replication" or "username" as the default option, at least a db of that name exists. regards, Ajin Cherian Fujitsu Australia

Re: Have pg_basebackup write "dbname" in "primary_conninfo"?

2024-02-20 Thread Ajin Cherian
ithout calling that function. This requires connecting to a database. regards, Ajin Cherian Fujitsu Australia

Re: Have pg_basebackup write "dbname" in "primary_conninfo"?

2024-02-20 Thread Ajin Cherian
blank in the generated primary_conninfo string as well. regards, Ajin Cherian Fujitsu Australia

Re: Improve eviction algorithm in ReorderBuffer

2024-02-14 Thread Ajin Cherian
On Sat, Feb 10, 2024 at 2:23 AM Masahiko Sawada wrote: > On Fri, Feb 9, 2024 at 7:35 PM Ajin Cherian wrote: > > > > > > > > On Tue, Feb 6, 2024 at 5:06 PM Masahiko Sawada > wrote: > >> > >> > >> I've attached the new version patch

Re: Improve eviction algorithm in ReorderBuffer

2024-02-09 Thread Ajin Cherian
is function when not necessary? This check was there in code before the patch. patch 0003: +/* + * The threshold of the number of transactions in the max-heap (rb->txn_heap) + * to switch the state. + */ +#define REORDE_BUFFER_MEM_TRACK_THRESHOLD 1024 Typo: I think you meant REORDER_ and not REORDE_ regards, Ajin Cherian Fujitsu Australia

Re: Synchronizing slots from primary to standby

2024-02-04 Thread Ajin Cherian
_syncslot = true so they can receive failover logical slots changes from the primary. regards, Ajin Cherian Fujitsu Australia

Re: Synchronizing slots from primary to standby

2024-01-31 Thread Ajin Cherian
a subscription without failover enabled to make sure that the Subscription with failover disabled does not depend on sync on standby, but this fails because we have failover enabled by default. In summary, I don't think these issues are actual bugs but expected behaviour change. regards, Ajin Cherian Fujitsu Australia

Re: Synchronizing slots from primary to standby

2024-01-23 Thread Ajin Cherian
n_slot('lsub2_slot', 'pgoutput', false, false, true); ?column? -- init (1 row) Time: 36.125 ms postgres=# SELECT 'init' FROM pg_create_logical_replication_slot('lsub1_slot', 'pgoutput', false, false, false); ?column? -- init (1 row) Time: 53.981 ms regards, Ajin Cherian Fujitsu Australia

Re: Synchronizing slots from primary to standby

2023-11-30 Thread Ajin Cherian
of the same name exists, then thereafter no new sync slots are created on standby. Is this expected? I do see that previously created slots are kept up to date, just that no new slots are created after that. regards, Ajin Cherian Fujitsu australia

Re: PATCH: Add REINDEX tag to event triggers

2023-11-27 Thread Ajin Cherian
plicate reindex_event_trigger_collect in indexcmds.c since it is already present in index.c. Add it to index.h and make the function extern so that it can be accessed in both index.c and indexcmds.c regards, Ajin Cherian Fujitsu Australia

Re: Synchronizing slots from primary to standby

2023-11-23 Thread Ajin Cherian
c = MyProc; + +before_shmem_exit(slotsync_worker_detach, (Datum) 0); + +LWLockRelease(SlotSyncWorkerLock); +} before_shmem_exit() can error out leaving the lock acquired. Maybe you should release the lock prior to calling before_shmem_exit() because you don't need the lock there. regards, Ajin Cherian Fujitsu Australia

Re: Synchronizing slots from primary to standby

2023-11-16 Thread Ajin Cherian
is not very good. Neither does it sound like an error, nor is there clarity on what the underlying problem is or how to correct it. regards, Ajin Cherian Fujitsu Australia

Re: Synchronizing slots from primary to standby

2023-11-13 Thread Ajin Cherian
validated locally (either by itself) or by primary_slot > being invalidated, then we should skip the sync. I will fix this in > the next version. Yes, that works. Another bug I see in my testing is that pg_get_slot_invalidation_cause() does not release the LOCK if it finds the slot it is search

Re: Synchronizing slots from primary to standby

2023-11-12 Thread Ajin Cherian
, then the local_slot_update() function is never called to set the invalidation status on the local slot. And for invalidated slots, restart_lsn is always NULL. regards, Ajin Cherian Fujitsu Australia

Re: Synchronizing slots from primary to standby

2023-10-30 Thread Ajin Cherian
27; flag by adding another flag "failover_opt_given". Plugins that set this, will be able to change the failover flag of the slot, while plugins that do not support this will not set this and the failover flag of the created slot will remain. What do you think? regards, Ajin Cherian Fujitsu Australia

Re: Synchronizing slots from primary to standby

2023-10-23 Thread Ajin Cherian
Context will not allow us to change the slot's failover flag using Alter subscription. Currently alter subscription re-establishes the connection using START REPLICATION and failover is one of the options passed in along with START REPLICATION. I am thinking of moving this change to StartLogicalReplication prior to calling CreateDecodingContext by parsing the command options in StartReplicationCmd without adding it to the LogicalDecodingContext. regards, Ajin Cherian Fujitsu Australia

Re: Synchronizing slots from primary to standby

2023-10-15 Thread Ajin Cherian
r (widx = 0; widx < max_slotsync_workers; widx++) > + { > + SlotSyncWorker *worker = &LogicalRepCtx->ss_workers[widx]; > + > + if (worker->hdr.in_use && !worker->dbcount) > + slotsync_worker_stop_internal(worker); > + } > > Is it safe to stop this unguarded by SlotSyncWorkerLock locking? Is > there a window where another dbid decides to reuse this worker at the > same time this process is about to stop it? > == Only the launcher can do this, and there is only one launcher. > ~~~ > > 26. primary_connect > > +/* > + * Connect to primary server for slotsync purpose and return the connection > + * info. Disconnect previous connection if provided in wrconn_prev. > + */ > > /primary server/the primary server/ > == fixed > ~~~ > > 27. > + if (!RecoveryInProgress()) > + return NULL; > + > + if (max_slotsync_workers == 0) > + return NULL; > + > + if (strcmp(synchronize_slot_names, "") == 0) > + return NULL; > + > + /* The primary_slot_name is not set */ > + if (!WalRcv || WalRcv->slotname[0] == '\0') > + { > + ereport(WARNING, > + errmsg("Skipping slots synchronization as primary_slot_name " > +"is not set.")); > + return NULL; > + } > + > + /* The hot_standby_feedback must be ON for slot-sync to work */ > + if (!hot_standby_feedback) > + { > + ereport(WARNING, > + errmsg("Skipping slots synchronization as hot_standby_feedback " > +"is off.")); > + return NULL; > + } > > How come some of these checks giving WARNING that slot synchronization > will be skipped, but others are just silently returning NULL? > == primary_slot_name and hot_standby_feedback are not GUCs exclusive to slot synchronization, they are previously existing - so warning only for them. The others are specific to slot synchronization, so if users set them (which shows that the user intends to use sync-slot), then warning to let the user know that these others also need to be set. > ~~~ > > 28. SaveCurrentSlotSyncConfigs > > +static void > +SaveCurrentSlotSyncConfigs() > +{ > + PrimaryConnInfoPreReload = pstrdup(PrimaryConnInfo); > + PrimarySlotNamePreReload = pstrdup(WalRcv->slotname); > + SyncSlotNamesPreReload = pstrdup(synchronize_slot_names); > +} > > Shouldn't this code also do pfree first? Otherwise these will slowly > leak every time this function is called, right? > == fixed > ~~~ > > 29. SlotSyncConfigsChanged > > +static bool > +SlotSyncConfigsChanged() > +{ > + if (strcmp(PrimaryConnInfoPreReload, PrimaryConnInfo) != 0) > + return true; > + > + if (strcmp(PrimarySlotNamePreReload, WalRcv->slotname) != 0) > + return true; > + > + if (strcmp(SyncSlotNamesPreReload, synchronize_slot_names) != 0) > + return true; > > I felt those can all be combined to have 1 return instead of 3. > == fixed > ~~~ > > 30. > + /* > + * If we have reached this stage, it means original value of > + * hot_standby_feedback was 'true', so consider it changed if 'false' now. > + */ > + if (!hot_standby_feedback) > + return true; > > "If we have reached this stage" seems a bit vague. Can this have some > more explanation? And, maybe also an Assert(hot_standby_feedback); is > helpful in the calling code (before the config is reloaded)? > == rewrote this without that comment. regards, Ajin Cherian Fujitsu Australia

Re: Synchronizing slots from primary to standby

2023-08-15 Thread Ajin Cherian
Then we took > > the average (integer value) of this distance over the span of 10 min > > workload and this is what we got: > > > > I have attached the scripts for schema-setup, running workload and > capturing lag. Please go through Readme for details. > > I did some more tests for 10,20 and 40 slots to calculate the average lsn distance between slots, comparing 1 worker and 3 workers. My results are as follows: 10 slots 1 worker: 5529.75527426 (average lsn distance between primary and standby per slot) 3 worker: 2224.57589134 20 slots 1 worker: 9592.87234043 3 worker: 3194.6293 40 slots 1 worker: 20566.093 3 worker: 7885.80952381 90 slots 1 worker: 36706.8405797 3 worker: 10236.6393162 regards, Ajin Cherian Fujitsu Australia

Re: Support logical replication of DDLs

2023-06-09 Thread Ajin Cherian
objid) > + return true; > > The comment seems wrong/confused – "Only send this ddl if we don't > publish ddl message" (??) > fixed. > == > src/bin/pg_dump/pg_dump.c > > 29. getEventTriggers > > + /* skip internally created event triggers by checking evtisinternal */ > appendPQExpBufferStr(query, > "SELECT e.tableoid, e.oid, evtname, evtenabled, " > "evtevent, evtowner, " > > Uppercase the comment. > fixed. > == > src/include/catalog/pg_event_trigger.h > > 33. > @@ -36,7 +36,7 @@ CATALOG(pg_event_trigger,3466,EventTriggerRelationId) > * called */ > char evtenabled; /* trigger's firing configuration WRT > * session_replication_role */ > - > + bool evtisinternal; /* trigger is system-generated */ > #ifdef CATALOG_VARLEN > text evttags[1]; /* command TAGs this event trigger targets */ > #endif > > ~ > > This change should not remove the blank line that previously existed > before the #ifdef CATALOG_VARLEN. > fixed. > == > src/include/catalog/pg_publication. > > 34. > +/* Publication trigger events */ > +#define PUB_TRIG_DDL_CMD_START "ddl_command_start" > +#define PUB_TRIG_DDL_CMD_END "ddl_command_end" > +#define PUB_TRIG_TBL_REWRITE "table_rewrite" > +#define PUB_TRIG_TBL_INIT_WRITE "table_init_write" > > Elsewhere in PG15 code there are already hardcoded literal strings for > these triggers, so I am wondering if these constants should really be > defined in some common place where everybody can make use of them > instead of having a mixture of string literals and macros for the same > strings. > fixed. > == > src/include/commands/event_trigger.h > > 35. > -extern Oid CreateEventTrigger(CreateEventTrigStmt *stmt); > +extern Oid CreateEventTrigger(CreateEventTrigStmt *stmt, bool isinternal); > extern Oid get_event_trigger_oid(const char *trigname, bool missing_ok); > > IMO a better name is 'is_internal' (Using a snake-case name matches > like the other 'missing_ok') > fixed. > == > src/include/replication/ddlmessage.h > > 36. > + * Copyright (c) 2022, PostgreSQL Global Development Group > > Copyright for the new file should be 2023? > fixed. > == > src/include/tcop/ddldeparse.h > > 37. > * ddldeparse.h > * > * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group > * Portions Copyright (c) 1994, Regents of the University of California > * > * src/include/tcop/ddldeparse.h > > ~ > > I think this is a new file for the feature so why is the copyright > talking about old dates like 1994,1996 etc? > fixed. regards, Ajin Cherian

Re: running logical replication as the subscription owner

2023-05-23 Thread Ajin Cherian
On Mon, May 22, 2023 at 10:36 PM Masahiko Sawada wrote: > > On Wed, May 17, 2023 at 10:10 AM Ajin Cherian wrote: > > > > On Mon, May 15, 2023 at 10:47 PM Masahiko Sawada > > wrote: > > > > > > On Mon, May 15, 2023 at 5:44 PM Ajin Cherian wrote: >

Re: running logical replication as the subscription owner

2023-05-16 Thread Ajin Cherian
On Mon, May 15, 2023 at 10:47 PM Masahiko Sawada wrote: > > On Mon, May 15, 2023 at 5:44 PM Ajin Cherian wrote: > > > > On Fri, May 12, 2023 at 9:55 PM Ajin Cherian wrote: > > > > > > If nobody else is working on this, I can come up with a patch to fix this &

Re: running logical replication as the subscription owner

2023-05-15 Thread Ajin Cherian
On Fri, May 12, 2023 at 9:55 PM Ajin Cherian wrote: > > If nobody else is working on this, I can come up with a patch to fix this > Attaching a patch which attempts to fix this. regards, Ajin Cherian Fujitsu Australia v1-0001-Fix-issue-where-the-copy-command-does-not-adhere-.patch De

Re: running logical replication as the subscription owner

2023-05-12 Thread Ajin Cherian
test should not have access to the table admin_audit. This means the table copy is being invoked as the subscription owner and not the table owner. However, I see subsequent inserts fail on replication with permission denied error, so the apply worker correctly applies the inserts as the table owner. If nobody else is working on this, I can come up with a patch to fix this regards, Ajin Cherian Fujitsu Australia

Re: Support logical replication of DDLs

2023-01-14 Thread Ajin Cherian
*) lfirst(lc1); ^ ddl_deparse.c:8956:31: note: each undeclared identifier is reported only once for each function it appears in ddl_deparse.c:8956:48: error: expected expression before ‘)’ token publication_rel *pub_rel = (publication_rel *) lfirst(lc1); regards, Ajin C

  1   2   3   >