I have faced two issues with logical replication.
Repro scenario:

1. Start two Postgres instances (I run both on the local machine).
2. Initialize pgbench tables on both instances:
    pgbench -i -s 10 postgres
3. Create a publication for the pgbench_accounts table on one node:
    create publication pub for table pgbench_accounts;
4. On the other node, empty the table, create the subscription, and add an always-enabled trigger:
    delete from pgbench_accounts;
    CREATE SUBSCRIPTION sub connection 'dbname=postgres host=localhost port=5432 sslmode=disable' publication pub;
    CREATE OR REPLACE FUNCTION replication_trigger_proc() RETURNS TRIGGER AS $$
    BEGIN
    --  RETURN NEW;
    END $$ LANGUAGE plpgsql;
    CREATE TRIGGER replication_trigger BEFORE INSERT OR UPDATE OR DELETE ON pgbench_accounts FOR EACH ROW EXECUTE PROCEDURE replication_trigger_proc();
    ALTER TABLE pgbench_accounts ENABLE ALWAYS TRIGGER replication_trigger;
5. Start pgbench on the primary node:
    pgbench -T 1000 -P 2 -c 10 postgres


Please note the commented-out "RETURN NEW" statement. Invoking this function causes an error, and the subscriber's log shows the following messages repeated:

2017-10-02 16:47:46.764 MSK [32129] LOG: logical replication table synchronization worker for subscription "sub", table "pgbench_accounts" has started
2017-10-02 16:47:46.771 MSK [32129] ERROR: control reached end of trigger procedure without RETURN
2017-10-02 16:47:46.771 MSK [32129] CONTEXT: PL/pgSQL function replication_trigger_proc()
    COPY pgbench_accounts, line 1: "1    1    0 "
2017-10-02 16:47:46.772 MSK [28020] LOG: worker process: logical replication worker for subscription 17264 sync 17251 (PID 32129) exited with exit code 1
...
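
To check whether the initial copy of the table ever completes, the per-table synchronization state can be inspected on the subscriber. This is only a sketch, assuming the PostgreSQL 10 catalogs; pg_subscription_rel is not part of the original report, and the state letters are as I understand them from the documentation:

```sql
-- Run on the subscriber. One row per table in each subscription.
-- srsubstate: i = initialize, d = data is being copied,
--             s = synchronized, r = ready (normal replication).
SELECT s.subname,
       sr.srrelid::regclass AS table_name,
       sr.srsubstate,
       sr.srsublsn
FROM pg_subscription_rel sr
JOIN pg_subscription s ON s.oid = sr.srsubid;
```

As long as the trigger keeps failing during COPY, I would not expect pgbench_accounts to reach state r.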


After a few minutes of running, the primary node (the one with the publication) crashes with the following stack trace:

Program terminated with signal SIGABRT, Aborted.
#0 0x00007f3608f8ec37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56    ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 0x00007f3608f8ec37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007f3608f92028 in __GI_abort () at abort.c:89
#2 0x00000000009f5740 in ExceptionalCondition (conditionName=0xbf6b30 "!(((xid) != ((TransactionId) 0)))", errorType=0xbf69af "FailedAssertion", fileName=0xbf69a8 "lmgr.c", lineNumber=582) at assert.c:54
#3 0x000000000086ac1d in XactLockTableWait (xid=0, rel=0x0, ctid=0x0, oper=XLTW_None) at lmgr.c:582
#4 0x000000000081c9f7 in SnapBuildWaitSnapshot (running=0x2749198, cutoff=898498) at snapbuild.c:1400
#5 0x000000000081c7c7 in SnapBuildFindSnapshot (builder=0x2807910, lsn=798477760, running=0x2749198) at snapbuild.c:1311
#6 0x000000000081c339 in SnapBuildProcessRunningXacts (builder=0x2807910, lsn=798477760, running=0x2749198) at snapbuild.c:1106
#7 0x000000000080a1c7 in DecodeStandbyOp (ctx=0x27ef870, buf=0x7ffd301858d0) at decode.c:314
#8 0x0000000000809ce1 in LogicalDecodingProcessRecord (ctx=0x27ef870, record=0x27efb30) at decode.c:117
#9 0x000000000080ddf9 in DecodingContextFindStartpoint (ctx=0x27ef870) at logical.c:458
#10 0x0000000000823968 in CreateReplicationSlot (cmd=0x27483a8) at walsender.c:934
#11 0x00000000008246ee in exec_replication_command (cmd_string=0x27b9520 "CREATE_REPLICATION_SLOT \"sub_17264_sync_17251\" TEMPORARY LOGICAL pgoutput USE_SNAPSHOT") at walsender.c:1511
#12 0x000000000088eb44 in PostgresMain (argc=1, argv=0x275b738, dbname=0x275b578 "postgres", username=0x272b800 "knizhnik") at postgres.c:4086
#13 0x00000000007ef9a1 in BackendRun (port=0x27532a0) at postmaster.c:4357
#14 0x00000000007ef0cb in BackendStartup (port=0x27532a0) at postmaster.c:4029
#15 0x00000000007eb68b in ServerLoop () at postmaster.c:1753
#16 0x00000000007eac77 in PostmasterMain (argc=3, argv=0x2729670) at postmaster.c:1361
#17 0x0000000000728552 in main (argc=3, argv=0x2729670) at main.c:228


Now fix the trigger function:
CREATE OR REPLACE FUNCTION replication_trigger_proc() RETURNS TRIGGER AS $$
BEGIN
  RETURN NEW;
END $$ LANGUAGE plpgsql;
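
A side note on the fixed function: the trigger is declared for INSERT OR UPDATE OR DELETE, but NEW holds no row for DELETE, and a BEFORE row trigger that returns NULL causes the operation to be skipped for that row. The default pgbench workload does not delete from pgbench_accounts, so this does not affect the repro. A variant that passes every operation through unchanged could look like this (a sketch, not part of the original test):

```sql
CREATE OR REPLACE FUNCTION replication_trigger_proc() RETURNS TRIGGER AS $$
BEGIN
  -- For DELETE there is no NEW row; return OLD so the delete proceeds.
  IF TG_OP = 'DELETE' THEN
    RETURN OLD;
  END IF;
  -- For INSERT and UPDATE, return the (unmodified) new row.
  RETURN NEW;
END $$ LANGUAGE plpgsql;
```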

Then manually perform two updates of the same row inside one transaction on the master:

postgres=# begin;
BEGIN
postgres=# update pgbench_accounts set abalance=abalance+1 where aid=26;
UPDATE 1
postgres=# update pgbench_accounts set abalance=abalance-1 where aid=26;
UPDATE 1
postgres=# commit;
<hangs>

and in the replica's log we see:
2017-10-02 18:40:26.094 MSK [2954] LOG: logical replication apply worker for subscription "sub" has started
2017-10-02 18:40:26.101 MSK [2954] ERROR:  attempted to lock invisible tuple
2017-10-02 18:40:26.102 MSK [2882] LOG: worker process: logical replication worker for subscription 16403 (PID 2954) exited with exit code 1
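
To see whether the apply worker is stuck in a restart loop, the standard monitoring views can be polled. This is a sketch assuming the PostgreSQL 10 views pg_stat_subscription and pg_replication_slots, which the original report does not mention. It does not explain the error, it only shows whether replication is making progress:

```sql
-- On the replica: relid is NULL for the main apply worker. A pid that
-- changes between calls while received_lsn stays put suggests the
-- worker keeps restarting after the same error.
SELECT subname, pid, relid, received_lsn, latest_end_lsn, latest_end_time
FROM pg_stat_subscription;

-- On the master: check that the slot for the subscription is active and
-- whether confirmed_flush_lsn advances.
SELECT slot_name, active, active_pid, restart_lsn, confirmed_flush_lsn
FROM pg_replication_slots;
```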

The error is raised in trigger.c:
#3 0x000000000069bddb in GetTupleForTrigger (estate=0x2e36b50, epqstate=0x7ffc4420eda0, relinfo=0x2dcfe90, tid=0x2dd08ac, lockmode=LockTupleNoKeyExclusive, newSlot=0x7ffc4420ec40) at trigger.c:3103
#4 0x000000000069b259 in ExecBRUpdateTriggers (estate=0x2e36b50, epqstate=0x7ffc4420eda0, relinfo=0x2dcfe90, tupleid=0x2dd08ac, fdw_trigtuple=0x0, slot=0x2dd0240) at trigger.c:2748
#5 0x00000000006d2395 in ExecSimpleRelationUpdate (estate=0x2e36b50, epqstate=0x7ffc4420eda0, searchslot=0x2dd0358, slot=0x2dd0240) at execReplication.c:461
#6 0x0000000000820894 in apply_handle_update (s=0x7ffc442163b0) at worker.c:736
#7  0x0000000000820d83 in apply_dispatch (s=0x7ffc442163b0) at worker.c:892
#8 0x000000000082121b in LogicalRepApplyLoop (last_received=2225693736) at worker.c:1095
#9  0x000000000082219d in ApplyWorkerMain (main_arg=0) at worker.c:1647
#10 0x00000000007dd496 in StartBackgroundWorker () at bgworker.c:835
#11 0x00000000007f0889 in do_start_bgworker (rw=0x2d75f50) at postmaster.c:5680
#12 0x00000000007f0bc3 in maybe_start_bgworkers () at postmaster.c:5884
#13 0x00000000007efbeb in sigusr1_handler (postgres_signal_arg=10) at postmaster.c:5073
#14 <signal handler called>
#15 0x00007fbe83a054a3 in __select_nocancel () at ../sysdeps/unix/syscall-template.S:81
#16 0x00000000007eb517 in ServerLoop () at postmaster.c:1717
#17 0x00000000007eac48 in PostmasterMain (argc=3, argv=0x2d4e660) at postmaster.c:1361
#18 0x0000000000728523 in main (argc=3, argv=0x2d4e660) at main.c:228







--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)