On 2017-10-26 22:01, Sokolov Yura wrote:
On 2017-09-27 14:46, Stas Kelvich wrote:
On 7 Sep 2017, at 18:58, Nikhil Sontakke <nikh...@2ndquadrant.com> wrote:

Hi,

FYI all, wanted to mention that I am working on an updated version of
the latest patch that I plan to submit to a later CF.


Cool!

So what kind of architecture do you have in mind? Same way as is it
was implemented before?
As far as I remember there were two main issues:

* Decodong of aborted prepared transaction.

If such transaction modified catalog then we can’t read reliable info
with our historic snapshot,
since clog already have aborted bit for our tx it will brake
visibility logic. There are some way to
deal with that — by doing catalog seq scan two times and counting
number of tuples (details
upthread) or by hijacking clog values in historic visibility function.
But ISTM it is better not solve this
issue at all =) In most cases intended usage of decoding of 2PC
transaction is to do some form
of distributed commit, so naturally decoding will happens only with
in-progress transactions and
we commit/abort will happen only after it is decoded, sent and
response is received. So we can
just have atomic flag that prevents commit/abort of tx currently being
decoded. And we can filter
interesting prepared transactions based on GID, to prevent holding
this lock for ordinary 2pc.

* Possible deadlocks that Andres was talking about.

I spend some time trying to find that, but didn’t find any. If locking
pg_class in prepared tx is the only
example then (imho) it is better to just forbid to prepare such
transactions. Otherwise if some realistic
examples that can block decoding are actually exist, then we probably
need to reconsider the way
tx being decoded. Anyway this part probably need Andres blessing.

Just rebased patch logical_twophase_v6 to master.

Fixed small issues:
- XactLogAbortRecord wrote DBINFO twice, but it was decoded in
  ParseAbortRecord only once. Second DBINFO were parsed as ORIGIN.
  Fixed by removing second write of DBINFO.
- SnapBuildPrepareTxnFinish tried to remove xid from `running` instead
  of `committed`. And it removed only xid, without subxids.
- test_decoding skipped returning "COMMIT PREPARED" and "ABORT PREPARED",

Big issue were with decoding ddl-including two-phase transactions:
- prepared.out were misleading. We could not reproduce decoding body of
  "test_prepared#3" with logical_twophase_v6.diff. It was skipped if
  `pg_logical_slot_get_changes` were called without
  `twophase-decode-with-catalog-changes` set, and only "COMMIT PREPARED
  test_prepared#3" were decoded.
The reason is "PREPARE TRANSACTION" is passed to `pg_filter_prepare`
twice:
- first on "PREPARE" itself,
- second - on "COMMIT PREPARED".
In v6, `pg_filter_prepare` without `with-catalog-changes` first time
answered "true" (ie it should not be decoded), and second time (when
transaction became committed) it answered "false" (ie it should be
decoded). But second time in DecodePrepare
`ctx->snapshot_builder->start_decoding_at`
is already in a future compared to `buf->origptr` (because it is
on "COMMIT PREPARED" lsn). Therefore DecodePrepare just called
ReorderBufferForget.
If `pg_filter_prepare` is called with `with-catalog-changes`, then
it returns "false" both times, thus DeocdePrepare decodes transaction
in first time, and calls `ReorderBufferForget` in second time.

I didn't found a way to fix it gracefully. I just change `pg_filter_prepare` to return same answer both times: "false" if called `with-catalog-changes`
(ie need to call DecodePrepare), and "true" otherwise. With this
change, catalog changing two-phase transaction is decoded as simple
one-phase transaction, if `pg_logical_slot_get_changes` is called
without `with-catalog-changes`.

Small improvement compared to v7:
- twophase_gid is written with alignment padding in the XactLogCommitRecord
and XactLogAbortRecord.

--
Sokolov Yura
Postgres Professional: https://postgrespro.ru
The Russian Postgres Company

Attachment: logical_twophase_v8.diff.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to