Hi all,

I would like to share a logical replication bug and some possible fixes. It 
seems that this bug has existed since
logical replication was first introduced, so it has been around for quite some 
time. In fact, the previously
reported issues [1], [2], [3] were all caused by this bug.


# Problem description


When in the BUILDING_SNAPSHOT state, the snapshot builder does not track the 
status of any
transaction. It can lead to missing transaction states when:
-- The transaction commits before the builder reaches FULL_SNAPSHOT state, and
-- The transaction's xid is greater than or equal to builder->xmin when the 
builder reaches
FULL_SNAPSHOT state.


Once in FULL_SNAPSHOT state, the builder constructs a base snapshot using 
incomplete transaction state
information. This results in an incorrect base snapshot, which can cause 
unpredictable behavior during
subsequent decoding. The case provided in v6-0002 attachment reproduces the 
issue (provided by ChangAo Chen).


# Code-level analysis


SnapBuildCommitTxn does consider transaction processing during the 
BUILDING_SNAPSHOT state. However, it
is only called from xact_decode -> DecodeCommit. xact_decode does not process 
any xact record when snapshot
builder have not yet reached the FULL_SNAPSHOT state, meaning those commits are 
ignored. Similarly, other
functions marking transaction having catalog changes (e.g., heap2_decode) also 
do not handle records before
reaching the FULL_SNAPSHOT state.


# Possible fixes


1. Replace snapshot at the time we reach CONSISTENT state.


Ajin Cherian in [4] and my initial thought was that although the snapshot at 
FULL_SNAPSHOT state might be
wrong, the snapshot at CONSISTENT state is guaranteed to be correct. Since 
decoding always starts after
reaching CONSISTENT state, we could update both the reorder buffer and the 
builder snapshot with the one
captured at CONSISTENT state. However, IMUC, this would still cause changes 
generated before CONSISTENT to
carry a wrong snapshot (see SnapBuildDistributeSnapshotAndInval).


2. Track transactions during BUILDING_SNAPSHOT state for snapshot builder
If the builder does not track transactions in BUILDING_SNAPSHOT state, then we 
make it track them.


1) ChangAo Chen in v6-0001 attachment provided a fix, already reviewed by 
several people (including me).
Bertrand Drouvot in [5] considered the logic a bit messy. And I prefer we 
should make the behavior of
snapshot building similar in both BUILDING_SNAPSHOT and FULL_SNAPSHOT states, 
except in cases where
a base snapshot is needed.


2) Based on v6-0001, I have provided a minimal fix in v6-0003 (not yet 
reviewed). AFAICS, it resolves
the problem, though it records additional useless information in the reorder 
buffer during BUILDING_SNAPSHOT
state (which is discarded later). This increases memory usage and slightly 
impacts performance. But since
snapshot building is infrequent, I consider this acceptable.


3) I have also prepared a cleaner and more efficient fix in v6-0004 than 
v6-0003, albeit more complex
(similar to v6-0001). Provided as an alternative reference.


I think we should fix this issue to ensure snapshot building is correct.
Looking forward to your reviews and any feedback on the above proposed 
solutions.


Best regards,
Haiyang Li




[1] 
https://www.postgresql.org/message-id/tencent_6AAF072A7623A11A85C0B5FD290232467808%40qq.com
[2] https://www.postgresql.org/message-id/[email protected]
[3] 
https://www.postgresql.org/message-id/2b9e5ac8.136f.19a8f7297ee.Coremail.ocean_li_996%40163.com
[4] 
https://www.postgresql.org/message-id/CAFPTHDYSQipcO_%2BGNt-ZQsk6cidt9Lc4PkcdvO7jnrugiUw0eg%40mail.gmail.com
 
[5] 
https://www.postgresql.org/message-id/ZrnlgJEH473Q1kTp%40ip-10-97-1-34.eu-west-3.compute.internal

Attachment: v6-0002-Add-test-case-snapshot_build-for-test_decoding.patch
Description: Binary data

Attachment: v6-0001-Track-transactions-committed-in-BUILDING_SNAPSHOT.patch
Description: Binary data

Attachment: v6-0003-Track-transaction-committed-in-BUILDING_SNAPSHOT.patch
Description: Binary data

Attachment: v6-0004-Track-transaction-committed-in-BUILDING_SNAPSHOT.patch
Description: Binary data

Reply via email to