On Wednesday, November 26, 2025 2:57 AM Masahiko Sawada <[email protected]> wrote:
>
> On Tue, Nov 25, 2025 at 4:02 AM Zhijie Hou (Fujitsu) <[email protected]> wrote:
> >
> > On Tuesday, November 25, 2025 3:30 AM Masahiko Sawada <[email protected]> wrote:
> > >
> > > Given that the computation of xmin and catalog_xmin among all slots
> > > could be executed concurrently, could the following scenario happen
> > > where procArray->replication_slot_xmin and
> > > procArray->replication_slot_catalog_xmin retreat to a non-invalid
> > > XID?
> > >
> > > 1. Suppose the initial value of
> > > procArray->replication_slot_catalog_xmin is 50.
> > > 2. Process-A updates its owned slot's catalog_xmin to 100, and
> > > computes the new catalog_xmin as 100 while holding
> > > ReplicationSlotControlLock in shared mode in
> > > ReplicationSlotsComputeRequiredLSN(). But it doesn't update the
> > > procArray's catalog_xmin value yet.
> > > 3. Process-B updates its owned slot's catalog_xmin to 150, and
> > > computes the new catalog_xmin as 150.
> > > 4. Process-B updates procArray->replication_slot_catalog_xmin to
> > > 150.
> > > 5. Process-A updates procArray->replication_slot_catalog_xmin to
> > > 100, which was 150.
> >
> > After further investigation, I think that steps 3 and 4 cannot occur
> > because Process-B must have already encountered the catalog_xmin
> > maintained by Process-A, either 50 or 100. Consequently, Process-B
> > will refrain from updating the catalog_xmin to a more recent value,
> > such as 150.
>
> Right. But the following scenario seems to happen:
>
> 1. Both processes have a slot with effective_catalog_xmin = 100.
> 2. Process-A updates effective_catalog_xmin to 150, and computes the new
> catalog_xmin as 100 because process-B slot still has
> effective_catalog_xmin = 100.
> 3. Process-B updates effective_catalog_xmin to 150, and computes the new
> catalog_xmin as 150.
> 4. Process-B updates procArray->replication_slot_catalog_xmin to 150.
> 5. Process-A updates procArray->replication_slot_catalog_xmin to 100.
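
To make the interleaving in steps 1-5 above concrete, here is a small
deterministic replay in C. It is only a sketch and not PostgreSQL code:
the Model struct, compute_min() and replay_interleaving() are
hypothetical names, XIDs are compared as plain integers (no wraparound
handling), and the initial shared value of 100 is an assumption that
follows from both slots starting at 100.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for TransactionId; wraparound is ignored here. */
typedef uint32_t Xid;

#define NUM_SLOTS 2

/* Hypothetical model of the state involved in the scenario. */
typedef struct
{
    Xid slot_catalog_xmin[NUM_SLOTS]; /* per-slot effective_catalog_xmin */
    Xid shared_catalog_xmin;          /* stands in for
                                       * procArray->replication_slot_catalog_xmin */
} Model;

/* The "compute" half of the update: minimum over all slots. */
Xid
compute_min(const Model *m)
{
    Xid    min;
    size_t i;

    assert(NUM_SLOTS > 0);
    min = m->slot_catalog_xmin[0];
    for (i = 1; i < NUM_SLOTS; i++)
    {
        if (m->slot_catalog_xmin[i] < min)
            min = m->slot_catalog_xmin[i];
    }
    return min;
}

/*
 * Replays steps 1-5 in the quoted order. Returns the final shared value
 * and stores in *oldest_needed the minimum that the slots actually
 * require at the end.
 */
Xid
replay_interleaving(Xid *oldest_needed)
{
    Model m = {{100, 100}, 100};          /* step 1: both slots at 100 */
    Xid   computed_by_a;
    Xid   computed_by_b;

    m.slot_catalog_xmin[0] = 150;         /* step 2: A advances its slot */
    computed_by_a = compute_min(&m);      /* B's slot is still 100 */

    m.slot_catalog_xmin[1] = 150;         /* step 3: B advances its slot */
    computed_by_b = compute_min(&m);      /* both slots are now 150 */

    m.shared_catalog_xmin = computed_by_b; /* step 4: B publishes its value */
    m.shared_catalog_xmin = computed_by_a; /* step 5: A publishes a stale value */

    *oldest_needed = compute_min(&m);
    return m.shared_catalog_xmin;
}
```

In this model, replay_interleaving() leaves the shared value at 100
while the minimum over the slots is 150. The shared value has gone
backwards, but it is older than anything a slot still requires.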

I think this scenario can occur, but it is not harmful: the catalog rows
removed prior to xid 150 would no longer be used, because both slots have
advanced their catalog_xmin and flushed the value to disk. Therefore,
even if replication_slot_catalog_xmin regresses, it should be OK.

Considering all of the above, I think allowing concurrent xmin
computation, as the patch does, is acceptable. What do you think?

Best Regards,
Hou zj
