Hi,

while working on logical decoding of sequences, I ran into an issue with nextval() in a transaction that rolls back, described in [1]. But after thinking about it a bit more (and chatting with Petr Jelinek), I think this issue affects physical sync replication too.

Imagine you have a primary <-> sync_replica cluster, and you do this:

  CREATE SEQUENCE s;

  -- shutdown the sync replica

  BEGIN;
  SELECT nextval('s') FROM generate_series(1,50);
  ROLLBACK;

  BEGIN;
  SELECT nextval('s');
  COMMIT;

The natural expectation would be that the COMMIT gets stuck, waiting for the sync replica (which is not running), right? But it does not.

The problem is exactly the same as in [1] - the aborted transaction generated WAL, but RecordTransactionAbort() ignores that and does not update LogwrtResult.Write, with the reasoning that aborted transactions do not matter. But sequences violate that, because we only write WAL once every 32 increments, so the following nextval() gets "committed" without waiting for the replica (because it did not produce WAL).
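To make the mechanism concrete, here is a toy model in Python. It is only a sketch under simplifying assumptions, not PostgreSQL code: the class and attribute names are made up, an LSN is just a counter, and the accounting of pre-logged values is simplified. The only things taken from the description above are that nextval() writes WAL once every 32 increments and that a commit which produced no WAL does not wait for the replica.

```python
SEQ_LOG_VALS = 32  # nextval() pre-logs this many values per WAL record


class ToyPrimary:
    """Hypothetical, heavily simplified model of a primary with one sequence."""

    def __init__(self):
        self.wal_lsn = 0             # end of WAL generated so far (a plain counter)
        self.seq_value = 0           # last value handed out by nextval()
        self.seq_logged = 0          # highest value already covered by a WAL record
        self.xact_wrote_wal = False  # did the current transaction write WAL?

    def begin(self):
        self.xact_wrote_wal = False

    def nextval(self):
        self.seq_value += 1
        if self.seq_value > self.seq_logged:
            # Ran out of pre-logged values: write one WAL record covering
            # this value and the next SEQ_LOG_VALS values.
            self.seq_logged = self.seq_value + SEQ_LOG_VALS
            self.wal_lsn += 1
            self.xact_wrote_wal = True
        return self.seq_value

    def rollback(self):
        # The abort path does not flush or wait for the replica.
        pass

    def commit_waits_for_replica(self):
        # A transaction that wrote no WAL has nothing to wait for.
        return self.xact_wrote_wal


def run_scenario():
    p = ToyPrimary()
    p.begin()
    for _ in range(50):
        p.nextval()
    p.rollback()

    p.begin()
    value = p.nextval()
    return value, p.commit_waits_for_replica(), p.wal_lsn


value, waits, wal_lsn = run_scenario()
print(value, waits, wal_lsn)
```

In this model the second transaction gets a value that is covered only by WAL written in the aborted transaction, and its commit does not wait for anything.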

I'm not sure this is a clear data corruption bug, but it surely walks and quacks like one. My proposal is to fix this by tracking the LSN of the last sequence increment, and then checking that LSN in RecordTransactionCommit() before calling XLogFlush().
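A rough illustration of the idea, again as a Python toy model and not as a patch: all names here are hypothetical, LSNs are plain counters, and where the LSN is remembered (per sequence and per transaction) is an assumption of this sketch. The point is only that commit compares the LSN of the last sequence WAL record it depends on against what the sync replica has confirmed.

```python
SEQ_LOG_VALS = 32  # nextval() pre-logs this many values per WAL record


class ToyPrimaryWithFix:
    """Hypothetical model: commit waits for the last sequence WAL record."""

    def __init__(self):
        self.wal_lsn = 0        # end of WAL generated so far (a plain counter)
        self.replica_lsn = 0    # WAL position confirmed by the sync replica
        self.seq_value = 0      # last value handed out by nextval()
        self.seq_logged = 0     # highest value already covered by a WAL record
        self.seq_lsn = 0        # LSN of the last WAL record for the sequence
        self.xact_wait_lsn = 0  # LSN the current transaction depends on

    def begin(self):
        self.xact_wait_lsn = 0

    def nextval(self):
        self.seq_value += 1
        if self.seq_value > self.seq_logged:
            self.seq_logged = self.seq_value + SEQ_LOG_VALS
            self.wal_lsn += 1
            self.seq_lsn = self.wal_lsn
        # Remember the sequence LSN even if this call wrote no WAL itself.
        self.xact_wait_lsn = max(self.xact_wait_lsn, self.seq_lsn)
        return self.seq_value

    def commit_must_wait(self):
        # Wait until the replica has confirmed the WAL this value relies on.
        return self.xact_wait_lsn > self.replica_lsn


def run_fixed_scenario():
    p = ToyPrimaryWithFix()
    p.begin()
    for _ in range(50):
        p.nextval()
    # ROLLBACK: nothing is flushed or waited for.

    p.begin()
    p.nextval()
    waits_while_replica_down = p.commit_must_wait()

    p.replica_lsn = p.wal_lsn  # replica comes back and catches up
    waits_after_catchup = p.commit_must_wait()
    return waits_while_replica_down, waits_after_catchup


print(run_fixed_scenario())
```

With this check the second commit blocks while the replica is down, and proceeds once the replica has confirmed the WAL written by the aborted transaction.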


regards


[1] https://www.postgresql.org/message-id/ae3cab67-c31e-b527-dd73-08f196999ad4%40enterprisedb.com

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company