Hi all

My earlier efforts at logical decoding of sequence advances were too
simplistic[1], falling afoul of issues with sequences being both
transactional and not transactional depending on whether the sequence is
created in the current xact or not.

TL;DR of solution:

* extend SeqTableData and xl_seq_rec with xid
* set xid in SeqTableData during DefineSequence()
* if SeqTableData set, write it in xl_seq_rec when writing xlog
* assign sequence update to specified xid's reorder buffer in decoding if
xid set, otherwise decode immediately


The problem:

If the sequence is created in the current xact (i.e. uncommitted) we have
to add the sequence updates to that xact to be replayed only if it commits.
The sequence is visible only to the toplevel xact that created the sequence
so advances of it can only come from that xact and its children. The actual
CREATE SEQUENCE is presumed to be handled separately by an event trigger or
similar.

If the new sequence is committed we must replay sequence advances
immediately and non-transactionally to ensure they're not lost due to xact
rollback or replayed in the wrong order due to xact commit order.

If the sequence is ALTERed in a way that changes pg_class that's event
triggers' job and sequence decoding doesn't care. If it's ALTERed in a way
that changes Form_pg_sequence we replay the change immediately, using the
last committed snapshot to get the sequence details, so the change will
take immediate effect and is retained whether or not any pg_class changes
are committed. This reflects how it happens on the upstream.


Planned solution:

Extend xl_seq_rec with a created_in_xid TransactionId field. If
created_in_xid != InvalidTransactionId, logical decoding associates the
sequence advance with the given toplevel xact and adds it to the reorder
buffer instead of immediately invoking the sequence decoding callback. The
decoding callback then gets invoked during ReorderBufferCommit processing
at the appropriate time, like any other transactional change.

To determine whether to log an xid for the sequence advance we need some
backend local state to determine whether the sequence is new in this xact.
Handily we already have one, the seqhashtab of SeqTableData in sequence.c,
just where it's needed. So all that's needed is to add a TransactionId
field that we set if we created that sequence in this session. If it's set
we test it for TransactionIsInProgress() when xlog'ing a sequence advance;
if it is, log that xid. If not in progress, clear the xid in SeqTableData
entry so we don't check again.



Another approach would be, during decoding, to look up the relfilenode of
the sequence to get the sequence oid and do a pg_class lookup. Check to see
whether xmin is part of an in-progress xact. If so, add the sequence
advance to that xact's reorder buffer, otherwise decode it immediately. The
problem is that

(a) I think we lack relfilenode-to-oid mapping information at
decoding-time. RelidByRelfilenode() needs a snapshot and is invoked
during ReorderBufferCommit(). We have make the transactional vs
nontransactional decision in LogicalDecodingProcessRecord() when I'm pretty
sure we don't have a snapshot.

(b) It also has issues with ALTER TRANSACTION. We must replay decoded xact
updates immediately even if some in-flight xact has modified the pg_class
entry for the sequence. So we can't just check whether the xmin is one of
our xact's (sub)xids, we must also check whether some older tuple for the
same sequence oid has a corresponding xmax and keep walking backwards until
we determine whether we originally CREATEd the sequence in this xact or
only ALTERed it.


So yeah. I think extending SeqTableData and xl_seq_rec with xid is the way
to go. Objections?

-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Reply via email to