On Wed, Jan 11, 2023 4:31 PM Melih Mutlu <m.melihmu...@gmail.com> wrote:
> 
> Hi hackers,
> 
> Rebased the patch to resolve conflicts.
> 

Thanks for your patch. Here are some comments.

0001 patch
============
1. walsender.c
+       /* Create a tuple to send consisten WAL location */

"consisten" should be "consistent" I think.

2. logical.c
+       if (need_full_snapshot)
+       {
+               LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+
+               SpinLockAcquire(&slot->mutex);
+               slot->effective_catalog_xmin = xmin_horizon;
+               slot->data.catalog_xmin = xmin_horizon;
+               slot->effective_xmin = xmin_horizon;
+               SpinLockRelease(&slot->mutex);
+
+               xmin_horizon = GetOldestSafeDecodingTransactionId(!need_full_snapshot);
+               ReplicationSlotsComputeRequiredXmin(true);
+
+               LWLockRelease(ProcArrayLock);
+       }

It seems that we should first get the safe decoding xid, then inform the slot
machinery about the new limit, right? Otherwise the limit will be
InvalidTransactionId and that seems inconsistent with the comment.
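
Something like this is what I had in mind (just a sketch, untested; it follows
the ordering that CreateInitDecodingContext() already uses):

        if (need_full_snapshot)
        {
                LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);

                /* Get the safe decoding xid first ... */
                xmin_horizon = GetOldestSafeDecodingTransactionId(!need_full_snapshot);

                SpinLockAcquire(&slot->mutex);
                slot->effective_catalog_xmin = xmin_horizon;
                slot->data.catalog_xmin = xmin_horizon;
                slot->effective_xmin = xmin_horizon;
                SpinLockRelease(&slot->mutex);

                /* ... then inform the slot machinery about the new limit. */
                ReplicationSlotsComputeRequiredXmin(true);

                LWLockRelease(ProcArrayLock);
        }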

3. doc/src/sgml/protocol.sgml
+       is used in the currenct transaction. This command is currently only supported
+       for logical replication.
+       slots.

We don't need to put "slots" on a new line. Also, "currenct" should be
"current".


0002 patch
============
1.
In pg_subscription_rel.h, I think the type of "srrelslotname" can be changed to
NameData, see "subslotname" in pg_subscription.h.
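
For example, something like the following (a sketch; the comment wording is
just a suggestion, and whether BKI_FORCE_NULL is wanted depends on whether the
column can be NULL, as subslotname can):

        /* Name of the replication slot used for synchronization (if any) */
        NameData        srrelslotname BKI_FORCE_NULL;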

2.
+                                * Find the logical replication sync worker if exists store
+                                * the slot number for dropping associated replication slots
+                                * later.

Should we add a comma after "if exists"? Maybe "if one exists, and store"
would read better.

3.
+       PG_FINALLY();
+       {
+               pfree(cmd.data);
+       }
+       PG_END_TRY();
+       \
+               return tablelist;
+}

Do we need the backslash?
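
I'd guess not. Presumably it should just be:

        PG_FINALLY();
        {
                pfree(cmd.data);
        }
        PG_END_TRY();

        return tablelist;
}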

4.
+       /*
+        * Advance to the LSN got from walrcv_create_slot. This is WAL
+        * logged for the purpose of recovery. Locks are to prevent the
+        * replication origin from vanishing while advancing.

"walrcv_create_slot" should be changed to
"walrcv_create_slot/walrcv_slot_snapshot" I think.

5.
+                       /* Replication drop might still exist. Try to drop */
+                       replorigin_drop_by_name(originname, true, false);

Should "Replication drop" be "Replication origin"?

6.
I saw an assertion failure in the following case; could you please look into it?
The backtrace is attached.

-- pub
CREATE TABLE tbl1 (a int, b text);
CREATE TABLE tbl2 (a int primary key, b text);
create publication pub for table tbl1, tbl2;
insert into tbl1 values (1, 'a');
insert into tbl1 values (1, 'a');

-- sub
CREATE TABLE tbl1 (a int primary key, b text);
CREATE TABLE tbl2 (a int primary key, b text);
create subscription sub connection 'dbname=postgres port=5432' publication pub;

Subscriber log:
2023-01-17 14:47:10.054 CST [1980841] LOG:  logical replication apply worker for subscription "sub" has started
2023-01-17 14:47:10.060 CST [1980843] LOG:  logical replication table synchronization worker for subscription "sub", table "tbl1" has started
2023-01-17 14:47:10.070 CST [1980845] LOG:  logical replication table synchronization worker for subscription "sub", table "tbl2" has started
2023-01-17 14:47:10.073 CST [1980843] ERROR:  duplicate key value violates unique constraint "tbl1_pkey"
2023-01-17 14:47:10.073 CST [1980843] DETAIL:  Key (a)=(1) already exists.
2023-01-17 14:47:10.073 CST [1980843] CONTEXT:  COPY tbl1, line 2
2023-01-17 14:47:10.074 CST [1980821] LOG:  background worker "logical replication worker" (PID 1980843) exited with exit code 1
2023-01-17 14:47:10.083 CST [1980845] LOG:  logical replication table synchronization worker for subscription "sub", table "tbl2" has finished
2023-01-17 14:47:10.083 CST [1980845] LOG:  logical replication table synchronization worker for subscription "sub" has moved to sync table "tbl1".
TRAP: failed Assert("node != InvalidRepOriginId"), File: "origin.c", Line: 892, PID: 1980845

Regards,
Shi yu
#0  0x00007f38bb78baff in raise () from /lib64/libc.so.6
#1  0x00007f38bb75eea5 in abort () from /lib64/libc.so.6
#2  0x0000000000b44501 in ExceptionalCondition (conditionName=0xd0deac "node != InvalidRepOriginId", fileName=0xd0da4d "origin.c", lineNumber=892) at assert.c:66
#3  0x00000000008f93f5 in replorigin_advance (node=0, remote_commit=22144256, local_commit=0, go_backward=true, wal_log=true) at origin.c:892
#4  0x000000000090cf2f in LogicalRepSyncTableStart (origin_startpos=0x7ffcbe54e808) at tablesync.c:1641
#5  0x0000000000913ee4 in start_table_sync (origin_startpos=0x7ffcbe54e808, myslotname=0x7ffcbe54e790) at worker.c:4419
#6  0x00000000009140be in run_tablesync_worker (options=0x7ffcbe54e7c0, slotname=0x0, originname=0x7ffcbe54e810 "pg_16398_3", originname_size=64, origin_startpos=0x7ffcbe54e808) at worker.c:4497
#7  0x00000000009147c5 in ApplyWorkerMain (main_arg=2) at worker.c:4749
#8  0x00000000008c899d in StartBackgroundWorker () at bgworker.c:861
#9  0x00000000008d2596 in do_start_bgworker (rw=0x1f2bb20) at postmaster.c:5746
#10 0x00000000008d2942 in maybe_start_bgworkers () at postmaster.c:5970
#11 0x00000000008d19f7 in process_pm_pmsignal () at postmaster.c:5133
#12 0x00000000008cdb06 in ServerLoop () at postmaster.c:1753
#13 0x00000000008cd4da in PostmasterMain (argc=3, argv=0x1ef5710) at postmaster.c:1452
#14 0x00000000007a9438 in main (argc=3, argv=0x1ef5710) at main.c:200
