Re: [HACKERS] logical decoding of two-phase transactions

Craig Ringer Thu, 02 Mar 2017 04:24:37 -0800

On 2 March 2017 at 16:20, Stas Kelvich <s.kelv...@postgrespro.ru> wrote:
>
>> On 2 Mar 2017, at 11:00, Craig Ringer <cr...@2ndquadrant.com> wrote:
>>
>> We already have it, because we just decoded the PREPARE TRANSACTION.
>> I'm preparing a patch revision to demonstrate this.
>
> Yes, we already have it, but if the server reboots between COMMIT PREPARED
> (when all prepared state is gone) and the decoding of that COMMIT PREPARED,
> then we lose that mapping, don’t we?


I was about to explain how restart_lsn works again, and how that would
mean we'd always re-decode the PREPARE TRANSACTION before any COMMIT
PREPARED or ROLLBACK PREPARED on crash. But...

Actually, the way you've implemented it, that won't be the case. You
treat PREPARE TRANSACTION as a special case of COMMIT, and the client
will presumably send replay confirmation after it has applied the
PREPARE TRANSACTION. In fact, it has to if we want 2PC to work with
synchronous replication. This will allow restart_lsn to advance to
after the PREPARE TRANSACTION record if there's no other older xact
and we see a suitable xl_running_xacts record. So we wouldn't decode
the PREPARE TRANSACTION again after restart.
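
To make that concrete, here is a rough, self-contained sketch of the
reasoning. It is a deliberately simplified model, not the real slot or
snapbuild code: SketchLsn, sketch_advance_restart_lsn() and
sketch_redecoded_after_restart() are names I made up for illustration,
and the rule "restart from the oldest in-progress xact's first record,
else from the xl_running_xacts record, once the client has confirmed
past that record" is an approximation of the real bookkeeping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t SketchLsn;            /* hypothetical stand-in for an LSN */
#define SKETCH_INVALID_LSN ((SketchLsn) 0)

/*
 * Simplified model of restart_lsn advancement (an assumption, see above).
 * oldest_inprogress_start is the first record of the oldest xact that
 * decoding still tracks as in progress, or SKETCH_INVALID_LSN if none.
 */
static SketchLsn
sketch_advance_restart_lsn(SketchLsn current_restart,
                           SketchLsn running_xacts_lsn,
                           SketchLsn oldest_inprogress_start,
                           SketchLsn confirmed_flush)
{
    SketchLsn candidate;

    assert(running_xacts_lsn != SKETCH_INVALID_LSN);

    /* The client has not confirmed replay past this record: no change. */
    if (confirmed_flush < running_xacts_lsn)
        return current_restart;

    candidate = (oldest_inprogress_start != SKETCH_INVALID_LSN)
        ? oldest_inprogress_start
        : running_xacts_lsn;

    /* restart_lsn never moves backwards. */
    return (candidate > current_restart) ? candidate : current_restart;
}

/* After a restart, decoding only sees records at or after restart_lsn. */
static bool
sketch_redecoded_after_restart(SketchLsn record_lsn, SketchLsn restart_lsn)
{
    return record_lsn >= restart_lsn;
}
```

If the prepared xact is dropped from the in-progress set at PREPARE
TRANSACTION (as when it is treated like a commit), the candidate is the
xl_running_xacts record itself, which lies after the PREPARE, so the
PREPARE is not decoded again. If the xact were still tracked as in
progress, restart_lsn would stay at or before its first record.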

Hm.

That's actually a pretty good reason to xlog the gid for 2pc rollback
and commit if we're at wal_level >= logical. Being able to advance
restart_lsn and avoid the re-decoding work is a big win.

Come to think of it, we have to advance the client replication
identifier as part of PREPARE TRANSACTION anyway, otherwise we'd try
to repeat and re-prepare the same xact on crash recovery.

Given that, I withdraw my objection to adding the gid to commit and
rollback xlog records, though it should only be done if they're 2pc
commit/abort, and only if XLogLogicalInfoActive().
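
Roughly, the condition I have in mind looks like the sketch below. This
is illustrative only and is not the actual xact.c record assembly code:
SketchFinishRecord, SKETCH_GIDSIZE, sketch_build_finish_record() and the
sketch_logical_info_active flag (standing in for
XLogLogicalInfoActive()) are all hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_GIDSIZE 200             /* hypothetical GID buffer size */

/* Hypothetical, stripped-down commit/abort record. */
typedef struct SketchFinishRecord
{
    bool is_two_phase;                 /* COMMIT/ROLLBACK PREPARED? */
    bool has_gid;                      /* was the gid logged? */
    char gid[SKETCH_GIDSIZE];
} SketchFinishRecord;

/* Stand-in for XLogLogicalInfoActive(); an assumption of this sketch. */
static bool sketch_logical_info_active = false;

/*
 * Fill in a finish record. The gid is attached only for 2PC
 * commit/abort and only when logical decoding info is being logged.
 * Returns false if the gid is missing or does not fit.
 */
static bool
sketch_build_finish_record(SketchFinishRecord *rec,
                           bool is_two_phase, const char *gid)
{
    assert(rec != NULL);

    memset(rec, 0, sizeof(*rec));
    rec->is_two_phase = is_two_phase;

    if (is_two_phase && sketch_logical_info_active)
    {
        if (gid == NULL || strlen(gid) >= SKETCH_GIDSIZE)
            return false;
        memcpy(rec->gid, gid, strlen(gid) + 1);
        rec->has_gid = true;
    }
    return true;
}
```

Ordinary commits and aborts, and all records below wal_level = logical,
stay the same size as before; only the 2PC finish records grow.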


>> BTW, I've been reviewing the patch in more detail. Other than a bunch
>> of copy-and-paste that I'm cleaning up, the main issue I've found is
>> that in DecodePrepare, you call:
>>
>>    SnapBuildCommitTxn(ctx->snapshot_builder, buf->origptr, xid,
>>                       parsed->nsubxacts, parsed->subxacts);
>>
>> but I am not convinced it is correct to call it at PREPARE TRANSACTION
>> time, only at COMMIT PREPARED time. We want to see the 2pc prepared
>> xact's state when decoding it, but there might be later commits that
>> cannot yet see that state and shouldn't have it visible in their
>> snapshots.
>
> Agreed, that is a problem. It allows us to decode this PREPARE, but after
> that it is better to mark this transaction as running in the snapshot, or to
> perform the prepare decoding with some kind of copied-and-edited snapshot.
> I’ll have a look at this.

Thanks.

It's also worth noting that with your current approach, 2PC xacts will
produce two calls to the output plugin's commit() callback: once for
the PREPARE TRANSACTION and again for the COMMIT PREPARED or
ROLLBACK PREPARED, the latter two with a faked-up state. I'm not a
huge fan of that. For one thing, it's not entirely backward compatible,
since it violates the previously safe assumption that there's a 1:1
relationship between begin and commit callbacks with no interleaving.
I also think it's a bit misleading to send a PREPARE TRANSACTION to a
callback that could previously only receive a true commit.

I particularly dislike calling a commit callback for an abort. So I'd
like to look further into the interface side of things. I'm inclined
to suggest adding new callbacks for 2pc prepare, commit and rollback,
and if the output plugin doesn't set them fall back to the existing
behaviour. Plugins that aren't interested in 2PC (think ETL) should
probably not have to deal with it; we might as well just send them
only the actually committed xacts, when they commit.
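
As a very rough sketch of the shape I'm thinking of (none of these
names exist in the tree; SketchCallbacks, SketchEvent and
sketch_dispatch() are made up here, and the real callbacks would take
the decoding context, the ReorderBufferTXN and so on):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int SketchXid;        /* hypothetical stand-in for an xid */

typedef struct SketchCallbacks
{
    void (*begin_cb)(SketchXid xid);
    void (*commit_cb)(SketchXid xid);
    /* Optional 2PC callbacks; left NULL by plugins that don't care. */
    void (*prepare_cb)(SketchXid xid, const char *gid);
    void (*commit_prepared_cb)(SketchXid xid, const char *gid);
    void (*rollback_prepared_cb)(SketchXid xid, const char *gid);
} SketchCallbacks;

typedef enum
{
    SKETCH_PREPARE,
    SKETCH_COMMIT_PREPARED,
    SKETCH_ROLLBACK_PREPARED
} SketchEvent;

/* Route one decoded 2PC event to the plugin. */
static void
sketch_dispatch(const SketchCallbacks *cb, SketchEvent ev,
                SketchXid xid, const char *gid)
{
    bool two_phase_aware;

    assert(cb != NULL && cb->begin_cb != NULL && cb->commit_cb != NULL);
    two_phase_aware = cb->prepare_cb != NULL &&
        cb->commit_prepared_cb != NULL &&
        cb->rollback_prepared_cb != NULL;

    switch (ev)
    {
        case SKETCH_PREPARE:
            /* Plugins without 2PC callbacks see nothing until commit. */
            if (two_phase_aware)
            {
                cb->begin_cb(xid);
                cb->prepare_cb(xid, gid);
            }
            break;
        case SKETCH_COMMIT_PREPARED:
            if (two_phase_aware)
                cb->commit_prepared_cb(xid, gid);
            else
            {
                /* Fallback: looks like an ordinary committed xact. */
                cb->begin_cb(xid);
                cb->commit_cb(xid);
            }
            break;
        case SKETCH_ROLLBACK_PREPARED:
            /* Never call a commit callback for an abort. */
            if (two_phase_aware)
                cb->rollback_prepared_cb(xid, gid);
            break;
    }
}

/* Example callbacks that only count calls, for trying the sketch out. */
static int sketch_begin_calls = 0;
static int sketch_commit_calls = 0;
static void sketch_count_begin(SketchXid xid) { (void) xid; sketch_begin_calls++; }
static void sketch_count_commit(SketchXid xid) { (void) xid; sketch_commit_calls++; }
```

With that, a plugin that sets only begin_cb and commit_cb keeps its 1:1
begin/commit pairing and never hears about a rolled-back prepared xact.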

-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
