Hi,

On 1/19/23 3:46 AM, Andres Freund wrote:
Hi,

On 2023-01-18 11:24:19 +0100, Drouvot, Bertrand wrote:
On 1/6/23 4:40 AM, Andres Freund wrote:
Hm, that's quite expensive. Perhaps worth adding a C helper that can do that
for us instead? This will likely also be needed in real applications after all.


Not sure I got it. What the C helper would be supposed to do?

Call LogStandbySnapshot().


Got it, I like the idea, will do.


With a reload in place in my testing, now I notice that the catalog_xmin
is updated on the primary physical slot after logical slots invalidation
when reloading hot_standby_feedback from "off" to "on".

This is not the case after a re-start (aka catalog_xmin is NULL).

I think a re-start and reload should produce identical behavior on
the primary physical slot. If so, I'm tempted to think that the catalog_xmin
should be updated in case of a re-start too (even if all the logical slots are 
invalidated)
because the slots are not dropped yet. What do you think?

I can't quite follow the steps leading up to the difference. Could you list
them in a bit more detail?



Sure, so with:

1) hot_standby_feedback set to off on the standby
2) create 2 logical replication slots on the standby and activate one
3) Invalidate the logical slots on the standby with VACUUM FULL on the primary
4) change hot_standby_feedback to on on the standby

If:

5) pg_reload_conf() on the standby, then on the primary we get a catalog_xmin
for the physical slot that the standby is attached to:

postgres=# select slot_type,xmin,catalog_xmin  from pg_replication_slots ;
 slot_type | xmin | catalog_xmin
-----------+------+--------------
 physical  |  822 |          748
(1 row)

But if:

5) re-start the standby, then on the primary we get an empty catalog_xmin
for the physical slot that the standby is attached to:

postgres=# select slot_type,xmin,catalog_xmin  from pg_replication_slots ;
 slot_type | xmin | catalog_xmin
-----------+------+--------------
 physical  |  816 |
(1 row)


Can we do something cheaper than rewriting the entire database? Seems
rewriting a single table ought to be sufficient?


While implementing the test at the table level I discovered that It looks like there is 
no guarantee that say a "vacuum full pg_class;" would
produce a conflict.

I assume that's mostly when there weren't any removal


Indeed, from what I can see in my testing it could generate a XLOG_HEAP2_PRUNE 
with snapshotConflictHorizon to 0:

"rmgr: Heap2       len (rec/tot):    107/   107, tx:        848, lsn: 0/03B98B30, 
prev 0/03B98AF0, desc: PRUNE snapshotConflictHorizon 0"


Having a snapshotConflictHorizon to zero leads to 
ResolveRecoveryConflictWithSnapshot() simply returning
without any conflict handling.

That doesn't have to mean anything bad. Some row versions can be removed without
creating a conflict. See HeapTupleHeaderAdvanceConflictHorizon(), specifically

         * Ignore tuples inserted by an aborted transaction or if the tuple was
         * updated/deleted by the inserting transaction.



It does look like that in the standby decoding case that's not the right 
behavior (and that the xid that generated the PRUNING should be used instead)
, what do you think?

That'd not work, because that'll be typically newer than the catalog_xmin. So
we'd start invalidating things left and right, despite not needing to.



Okay, thanks for the explanations that makes sense.

Did you see anything else around this making you suspicious?


No, but a question still remains to me:

Given the fact that the row removal case is already done
in the next test (aka Scenario 2), If we want to replace the "vacuum full" test
on the database (done in Scenario 1) with a cheaper one at the table level,
what could it be to guarantee an invalidation?

Same as scenario 2 but with "vacuum full pg_class" would not really add value
to the tests, right?

+##################################################
+# Test standby promotion and logical decoding behavior
+# after the standby gets promoted.
+##################################################
+

I think this also should test the streaming / walsender case.


Do you mean cascading standby?

I mean a logical walsender that starts on a standby and continues across
promotion of the standby.


Got it, thanks, will do.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


Reply via email to