> On 11 Aug 2016, at 17:43, Petr Jelinek <[email protected]> wrote:
>
>>
>> * Also, I wasn't actually able to run replication itself =) While the
>> regression tests pass, the TAP tests and a manual run get stuck.
>> pg_subscription_rel.substate never becomes 'r'. I'll investigate that
>> more and write again.
>
> Interesting, please keep me posted. It's possible for tables to stay in 's'
> state for some time if there is nothing happening on the server, but that
> should not mean anything is stuck.
I played around with it a bit, and it seems that the apply worker waits
forever for the substate change:
(lldb) bt
* thread #1: tid = 0x183e00, 0x00007fff88c7f2a2 libsystem_kernel.dylib`poll + 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
    frame #0: 0x00007fff88c7f2a2 libsystem_kernel.dylib`poll + 10
    frame #1: 0x00000001017ca8a3 postgres`WaitEventSetWaitBlock(set=0x00007fd2dc816b30, cur_timeout=10000, occurred_events=0x00007fff5e7f67d8, nevents=1) + 51 at latch.c:1108
    frame #2: 0x00000001017ca438 postgres`WaitEventSetWait(set=0x00007fd2dc816b30, timeout=10000, occurred_events=0x00007fff5e7f67d8, nevents=1) + 248 at latch.c:941
    frame #3: 0x00000001017c9fde postgres`WaitLatchOrSocket(latch=0x000000010ab208a4, wakeEvents=25, sock=-1, timeout=10000) + 254 at latch.c:347
    frame #4: 0x00000001017c9eda postgres`WaitLatch(latch=0x000000010ab208a4, wakeEvents=25, timeout=10000) + 42 at latch.c:302
  * frame #5: 0x0000000101793352 postgres`wait_for_sync_status_change(tstate=0x0000000101e409b0) + 178 at tablesync.c:228
    frame #6: 0x0000000101792bbe postgres`process_syncing_tables_apply(slotname="subbi", end_lsn=140734778796592) + 430 at tablesync.c:436
    frame #7: 0x00000001017928c1 postgres`process_syncing_tables(slotname="subbi", end_lsn=140734778796592) + 81 at tablesync.c:518
    frame #8: 0x000000010177b620 postgres`LogicalRepApplyLoop(last_received=140734778796592) + 704 at apply.c:1122
    frame #9: 0x000000010177bef4 postgres`ApplyWorkerMain(main_arg=0) + 1044 at apply.c:1353
    frame #10: 0x000000010174cb5a postgres`StartBackgroundWorker + 826 at bgworker.c:729
    frame #11: 0x0000000101762227 postgres`do_start_bgworker(rw=0x00007fd2db700000) + 343 at postmaster.c:5553
    frame #12: 0x000000010175d42b postgres`maybe_start_bgworker + 427 at postmaster.c:5761
    frame #13: 0x000000010175bccf postgres`sigusr1_handler(postgres_signal_arg=30) + 383 at postmaster.c:4979
    frame #14: 0x00007fff9ab2352a libsystem_platform.dylib`_sigtramp + 26
    frame #15: 0x00007fff88c7e07b libsystem_kernel.dylib`__select + 11
    frame #16: 0x000000010175d5ac postgres`ServerLoop + 252 at postmaster.c:1665
    frame #17: 0x000000010175b2e0 postgres`PostmasterMain(argc=3, argv=0x00007fd2db403840) + 5968 at postmaster.c:1309
    frame #18: 0x000000010169507f postgres`main(argc=3, argv=0x00007fd2db403840) + 751 at main.c:228
    frame #19: 0x00007fff8d45c5ad libdyld.dylib`start + 1
(lldb) p state
(char) $1 = 'c'
(lldb) p tstate->state
(char) $2 = 'c'
Also, I noticed that some LSN positions look wrong on the publisher:
postgres=# select restart_lsn, confirmed_flush_lsn from pg_replication_slots;
 restart_lsn | confirmed_flush_lsn
-------------+---------------------
 0/1530EF8   | 7FFF/5E7F6A30
(1 row)

postgres=# select sent_location, write_location, flush_location, replay_location from pg_stat_replication;
 sent_location | write_location | flush_location | replay_location
---------------+----------------+----------------+-----------------
 0/1530F30     | 7FFF/5E7F6A30  | 7FFF/5E7F6A30  | 7FFF/5E7F6A30
(1 row)
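
As a cross-check, the decimal end_lsn / last_received value from the backtrace
can be converted to the hi/lo hex notation that psql prints. The sketch below
is only an illustration and not part of the patch; format_lsn is a hypothetical
helper name, and it assumes the usual "%X/%X" rendering of a 64-bit WAL
position.

```python
def format_lsn(value: int) -> str:
    """Render a 64-bit WAL position in 'hi/lo' hex notation (assumed %X/%X)."""
    hi = value >> 32          # upper 32 bits
    lo = value & 0xFFFFFFFF   # lower 32 bits
    return f"{hi:X}/{lo:X}"


# end_lsn / last_received as shown in frames #6 to #8 of the backtrace
end_lsn = 140734778796592
print(format_lsn(end_lsn))  # 7FFF/5E7F6A30
```

The result is the same 7FFF/5E7F6A30 that the publisher reports as
confirmed_flush_lsn, so the odd position seems to originate in the apply
worker. It also sits close to the stack addresses in the backtrace (for
example occurred_events=0x00007fff5e7f67d8), so it may be an uninitialized
variable rather than a real WAL position, but that is only a guess.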
--
Stas Kelvich
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers