On Wednesday, November 26, 2025 2:57 AM Masahiko Sawada <[email protected]> 
wrote:
> 
> On Tue, Nov 25, 2025 at 4:02 AM Zhijie Hou (Fujitsu) <[email protected]>
> wrote:
> >
> > On Tuesday, November 25, 2025 3:30 AM Masahiko Sawada
> > <[email protected]> wrote:
> > >
> > >
> > > Given that the computation of xmin and catalog_xmin among all slots
> > > could be executed concurrently, could the following scenario happen
> > > where procArray->replication_slot_xmin and
> > > procArray->replication_slot_catalog_xmin retreat to a non-invalid
> > > XID?
> > >
> > > 1. Suppose the initial value procArray->replication_slot_catalog_xmin is
> > > 50.
> > > 2. Process-A updates its owned slot's catalog_xmin to 100, and
> > > computes the new catalog_xmin as 100 while holding
> > > ReplicationSlotControlLock in a shared mode in
> > > ReplicationSlotsComputeRequiredXmin(). But it doesn't update the
> > > procArray's catalog_xmin value yet.
> > > 3. Process-B updates its owned slot's catalog_xmin to 150, and
> > > computes the new catalog_xmin as 150.
> > > 4. Process-B updates the procArray->replication_slot_catalog_xmin to
> > > 150.
> > > 5. Process-A updates the procArray->replication_slot_catalog_xmin to
> > > 100, which was 150.
> >
> > After further investigation, I think that steps 3 and 4 cannot occur
> > because Process-B must have already encountered the catalog_xmin
> > maintained by Process-A, either 50 or 100. Consequently, Process-B
> > will refrain from updating the catalog_xmin to a more recent value, such as
> > 150.
> 
> Right. But the following scenario seems to happen:
> 
> 1. Both processes have a slot with effective_catalog_xmin = 100.
> 2. Process-A updates effective_catalog_xmin to 150, and computes the new
> catalog_xmin as 100 because process-B slot still has effective_catalog_xmin =
> 100.
> 3. Process-B updates effective_catalog_xmin to 150, and computes the new
> catalog_xmin as 150.
> 4. Process-B updates procArray->replication_slot_catalog_xmin to 150.
> 5. Process-A updates procArray->replication_slot_catalog_xmin to 100.

I think this scenario can occur, but it is not harmful: the catalog rows
removed prior to xid:150 would no longer be used, because both slots have
advanced their catalog_xmin and flushed the value to disk. Therefore, even if
replication_slot_catalog_xmin regresses, it should be OK.
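
To make the interleaving concrete, here is a minimal sketch in Python that
replays the five steps sequentially. It is a toy model and not PostgreSQL code:
the dictionary of slots, the helper compute_required_catalog_xmin() and the
variable proc_array_catalog_xmin are hypothetical stand-ins for the slots'
effective_catalog_xmin values and procArray->replication_slot_catalog_xmin.

```python
# Toy model of the interleaving discussed above; NOT PostgreSQL code.
# All names here are hypothetical stand-ins chosen for illustration.
INVALID_XID = 0


def compute_required_catalog_xmin(slots):
    """Return the minimum valid catalog_xmin across all slots."""
    valid = [xid for xid in slots.values() if xid != INVALID_XID]
    return min(valid) if valid else INVALID_XID


# Step 1: both slots start with effective_catalog_xmin = 100.
slots = {"A": 100, "B": 100}
proc_array_catalog_xmin = 100
history = []  # values published to the shared variable, in order

# Step 2: Process-A advances its slot and computes the minimum (still 100).
slots["A"] = 150
computed_by_a = compute_required_catalog_xmin(slots)

# Step 3: Process-B advances its slot and computes the minimum (now 150).
slots["B"] = 150
computed_by_b = compute_required_catalog_xmin(slots)

# Step 4: Process-B publishes its result first.
proc_array_catalog_xmin = computed_by_b
history.append(proc_array_catalog_xmin)

# Step 5: Process-A publishes its older result, moving the value backwards.
proc_array_catalog_xmin = computed_by_a
history.append(proc_array_catalog_xmin)

# What a fresh computation over the slots would give at this point.
true_minimum = compute_required_catalog_xmin(slots)

print("published values:", history)
print("minimum over slots:", true_minimum)
```

In this model the published value ends up at 100 while both slots are already
at 150, i.e. the shared value regresses but stays behind what the slots
actually require.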

Considering all of the above, I think allowing concurrent xmin computation, as
the patch does, is acceptable. What do you think?

Best Regards,
Hou zj
