On Wed, Jan 13, 2021 at 12:11 AM Amit Kapila <amit.kapil...@gmail.com> wrote:
> Thanks for doing these tests. I think you can put an elog in the below > code change as well to show that the recovery code path is also hit: > > +xact_redo_abort(xl_xact_parsed_abort *parsed, TransactionId xid, > + XLogRecPtr lsn, RepOriginId origin_id) > { > ... > + if (parsed->xinfo & XACT_XINFO_HAS_ORIGIN) > + { > + /* recover apply progress */ > + replorigin_advance(origin_id, parsed->origin_lsn, lsn, > + false /* backward */, false /* WAL */); > + } I added the below log: @@ -2299,6 +2309,14 @@ RecordTransactionAbortPrepared(TransactionId xid, MyXactFlags | XACT_FLAGS_ACQUIREDACCE xid, gid); + if (replorigin) + { + /* Move LSNs forward for this replication origin */ + replorigin_session_advance(replorigin_session_origin_lsn, + XactLastRecEnd); + elog(LOG,"advance replorigin for abort prepared"); + } + /* Always flush, since we're about to remove the 2PC state file */ XLogFlush(recptr); Then on server1: postgres=# begin; BEGIN postgres=*# insert into tab values (5); INSERT 0 1 postgres=*# prepare transaction 'test'; PREPARE TRANSACTION postgres=# rollback prepared 'test'; ROLLBACK PREPARED And I immediately stopped server 2 to prevent checkpoint: pg_ctl stop -m immediate and restarted server 2: I got the following logs: 2021-01-13 02:14:40.297 EST [4851] LOG: listening on IPv6 address "::1", port 54321 2021-01-13 02:14:40.297 EST [4851] LOG: listening on IPv4 address "127.0.0.1", port 54321 2021-01-13 02:14:40.323 EST [4851] LOG: listening on Unix socket "/tmp/.s.PGSQL.54321" 2021-01-13 02:14:40.333 EST [4852] LOG: database system was interrupted; last known up at 2021-01-13 02:04:08 EST 2021-01-13 02:14:40.842 EST [4852] LOG: recovered replication state of node 1 to 0/1606C60 2021-01-13 02:14:40.843 EST [4852] LOG: database system was not properly shut down; automatic recovery in progress 2021-01-13 02:14:40.850 EST [4852] LOG: redo starts at 0/160BFE8 2021-01-13 02:14:40.850 EST [4852] LOG: recover apply progress in red abort <==== the log added for this 2021-01-13 02:14:40.850 EST [4852] CONTEXT: WAL redo at 0/160C0E8 for Transaction/ABORT_PREPARED: 533: 2021-01-13 02:14:20.911933-05 2021-01-13 02:14:40.853 EST [4852] LOG: invalid record length at 0/160C160: wanted 24, got 0 2021-01-13 02:14:40.853 EST [4852] LOG: redo done at 0/160C128 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s 2021-01-13 02:14:40.868 EST [4851] LOG: database system is ready to accept connections 2021-01-13 02:14:40.897 EST [4887] LOG: logical replication apply worker for subscription "tap_sub" has started I see the same logs are seen in the test cases that have been added as part of https://www.postgresql.org/message-id/caa4ek1l3p4z+9wtk77mbdpkagr4gs2y3r1je7zevlqvf9ga...@mail.gmail.com regards, Ajin Cherian Fujitsu Australia .