RE: Fix slot synchronization with two_phase decoding enabled

Zhijie Hou (Fujitsu) Mon, 21 Apr 2025 21:00:30 -0700

On Tue, Apr 22, 2025 at 11:23 AM Amit Kapila wrote:

> 
> On Fri, Apr 18, 2025 at 9:58 AM Amit Kapila <[email protected]> wrote:
> >
> > On Thu, Apr 17, 2025 at 6:14 PM Zhijie Hou (Fujitsu)
> > <[email protected]> wrote:
> > >
> > > -----
> > > Fix
> > > -----
> > >
> > > I think we should keep the confirmed_flush even if the previous
> > > synced restart_lsn/catalog_xmin is newer. Attachments include a patch
> for the same.
> > >
> >
> > This will fix the case we are facing but adds a new rule for slot
> > synchronization. Can we think of a simpler way to fix this by avoiding
> > updating other slot fields (like two_phase, two_phase_at) if
> > restart_lsn or catalog_xmin of the local slot is ahead of the remote
> > slot?
> >
> 
> Thinking more about this problem, it seems to me that if the catalog_xmin of
> synced slot is allowed to be ahead than the remote_slot when there is still an
> open (prepared transaction), it could cause data loss.  I mean that after the
> promotion, some of the required catalog rows could be removed, and decoding
> corresponding changes (changes from tables affected by DDL) could give
> unexpected results. Those would be protected on primary/publisher because
> the catalog_xmin on it was still accurate and behind. If this theory turns 
> out to
> be true, then this is a drawback/bug of the existing fast_forward mode code.


I agree that could be a problem.

Upon further analysis, I find that the core problem is we do not build a base
snapshot when decoding changes during fast forward mode, preventing reference
to the minimum transaction ID that remains visible in the snapshot when
determining the candidate for catalog_xmin. As a result, catalog_xmin was
directly advanced to the oldest running transaction ID found in the latest
running_xacts record.

In code-level, SnapBuildProcessChange -> ReorderBufferSetBaseSnapshot could not
be reached during fast forward decoding, resulting in
rb->txns_by_base_snapshot_lsn being NULL. When advancing catalog_xmin, the
system attempts to refer to rb->txns_by_base_snapshot_lsn via
ReorderBufferGetOldestXmin(). However, since rb->txns_by_base_snapshot_lsn is
NULL, it defaults to directly using running->oldestRunningXid as the candidate
for catalog_xmin. For reference, see the implementation details in
SnapBuildProcessRunningXacts.

I think this is a general issue in fast forward decoding, which not only affect
slotsync. I can reproduce the issue using slot_advance SQL API as well. See the
attachment for a patch that includes a test to prove that the catalog data that
are still required would be removed after premature catalog_xmin advancement
during fast forward decoding. If you test that patch on HEAD, you would find
that the output missed a column due to vacuum removal.

I will start a new thread to report and fix this general.

Best Regards,
Hou zj

0001-Add-a-test.patch
Description: 0001-Add-a-test.patch

RE: Fix slot synchronization with two_phase decoding enabled

Reply via email to