On Wed, Aug 10, 2016 at 6:09 PM, Heikki Linnakangas <hlinn...@iki.fi> wrote:
> Hmm. There's one more possible way this could all work. Let's have CSN ==
> LSN, also for asynchronous commits. A snapshot is the current insert
> position, but also make note of the current flush position, when you take a
> snapshot. Now, when you use the snapshot, if you ever see an XID that
> committed between the snapshot's insert position and the flush position,
> wait for the WAL to be flushed up to the snapshot's insert position at that
> point. With that scheme, an asynchronous commit could return to the
> application without waiting for a flush, but if someone actually looks at
> the changes the transaction made, then that transaction would have to wait.
> Furthermore, we could probably skip that waiting too, if the reading
> transaction is also using synchronous_commit=off.
> That's slightly different from the current behaviour. A transaction that
> runs with synchronous_commit=on, and reads data that was modified by an
> asynchronous transaction, would take a hit. But I think that would be
> acceptable.

My proposal of vector clocks would allow for pretty much exactly
current behavior.

To simplify, there would be lastSyncCommitSeqNo and
lastAsyncCommitSeqNo variables in ShmemVariableCache. Transaction
commit would choose which one to update based on synchronous_commit
setting and embed the value of the setting into CSN log. Snapshots
would contain both values, when checking for CSN visibility use the
value of the looked up synchronous_commit setting to decide which
value to compare against. Standby's replaying commit records would
just update both values, resulting in transactions becoming visible in
xlog order, as they do today. The scheme would allow for inventing a
new xlog record/replication message communicating visibility ordering.

However I don't see why inventing a separate CSN concept is a large
problem. Quite the opposite, unless there is a good reason that I'm
missing, it seems better to not unnecessarily conflate commit record
durability and transaction visibility ordering. Not having them tied
together allows for an external source to provide CSN values, allowing
for interesting distributed transaction implementations. E.g. using a
timestamp as the CSN a'la Google Spanner and the TrueTime API.

Ants Aasma

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to