On Wed, Mar 18, 2026 at 4:08 PM Ashutosh Sharma <[email protected]> wrote: > > Hi, > > Thanks again for reviewing the test cases. Please find my comments inline > below: > > On Wed, Mar 18, 2026 at 12:01 PM shveta malik <[email protected]> wrote: > > > > > > Few comments: > > > > 1) > > + > > +# Physical replication slots for the two standbys > > +$primary->safe_psql('postgres', > > + "SELECT pg_create_physical_replication_slot('sb1_slot');"); > > +$primary->safe_psql('postgres', > > + "SELECT pg_create_physical_replication_slot('sb2_slot');"); > > +$primary->safe_psql('postgres', > > + "SELECT pg_create_physical_replication_slot('first_slot');"); > > > > Can we please write a comment atop 'first_slot' creation for its purpose. > > The comment '+# Physical replication slots for the two standbys' > > confuses as we are creating 3 slots following that comment. > > > > Moved this to PART E (the test-case where this slot is being used). > This is also suggested in one of your comments below. > > > 2) > > +# PART A: Plain list (ALL mode) blocks when any slot is unavailable > > > > +# PART C: Verify plain list (ALL mode) requires ALL slots > > > > I think we can merge the 2 tests. Is there a specific reason we have > > it in 2 different sections (am I missing something?). > > > > Part A can first test blocking on inactive standby (which it is doing > > already) and then restart standby and can check that logical decoding > > now proceeds. Part C can then be removed. > > > > I did not remove PART C entirely, however, I removed the portions that > overlapped with PART A. > > I would suggest not just evaluating each test case independently, it > would be useful to also consider the overall flow to understand why > PART B follows PART A, and so on. > > Viewed sequentially, PART A tests a unique scenario while also setting > up the environment needed for PART B (quorum mode). Similarly, PART C > covers a different scenario and prepares the setup for the test case > in PART D. This approach helps avoid unnecessary stop/start cycles of > the synchronous standby. > > > 3) > > Same with these 2: > > +# PART B: ANY mode (quorum) — logical decoding proceeds with N-of-M slots > > > > +# PART D: ANY mode — verify it works with either standby > > > > Part B tests that it works when standby1 is down. > > Part D tests that it works when standby2 is down and also works when > > both are up. > > > > I think we don't need these duplicate test cases (unless I am missing > > something?), we can merge Part D to Part B itself i.e. test with > > standby1 down and up in a single test. > > > > Removed PART D. It was almost similar to PART B. > > > 4) > > These 2 tests are basically testing the same thing. Do we need both? > > +# FIRST 1 should skip sb1_slot (unavailable) and use sb2_slot. > > > > +# Test that FIRST 1 is different from plain list — FIRST 1 succeeds > > with one down. > > > > Yes, they are actually the same. Hence removed the latter one. > > > > > 5) > > I see that we only need 'first_slot' for the last test and do not have > > even a standby associated with it. So we can even move it's creation > > to that test itself instead of creating it in the beginning and adding > > comments to explain 'why'. > > > > This is related to comment 1 above. > > PFA patch that addresses above comments.
I was reviewing the latest patch and here are few comments/suggestions 1. + <para> + The keyword <literal>FIRST</literal>, coupled with + <replaceable class="parameter">num_sync</replaceable>, specifies + priority-based semantics. Logical decoding will wait for the first + <replaceable class="parameter">num_sync</replaceable> available + physical slots in priority order (the order they appear in the list). + If a slot is missing, logical, invalidated, or inactive, it will be + skipped. However, if a slot exists and is valid and active but has + not yet caught up, the system will wait for it rather than skipping + to lower-priority slots. + </para> This explains that it will skip missing, logical, invalidated, or inactive slots while processing the list and deciding on which first N slots to wait for, but I could not understand what would be the behavior if after skipping a few invalid slots the remaining valid slots are less than N, will it proceed for decoding? I think this should be clarified in docs. 2. I see a lot of overlapping code between check_synchronized_standby_slots() and check_synchronous_standby_names(), can we do some refactoring to make a common code? I have just pass through the patch and will try to complete the review soon. -- Regards, Dilip Kumar Google
