Alexander Korotkov <aekorot...@gmail.com> writes: > I think this indicates unfinished intention to wait for checkpoint > completion. But I think both cases (checkpoint finished and > unfinished) should work correctly. So, I believe there is a backend > problem. I'm trying to reproduce this locally. Sorry for the > confusion.
I've found it not hugely difficult to reproduce: on two different Mac M4 machines, repeated runs of 046_checkpoint_logical_slot.pl hung up after 30 or so tries. Backtrace of the stuck query looks like (lldb) bt * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP * frame #0: 0x00000001836b01c8 libsystem_kernel.dylib`__semwait_signal + 8 frame #1: 0x000000018358c6f4 libsystem_c.dylib`nanosleep + 220 frame #2: 0x000000010070d358 postgres`pg_usleep(microsec=1000) at pgsleep.c:50:10 [opt] frame #3: 0x00000001002a0508 postgres`read_local_xlog_page_guts(state=0x0000000126057298, targetPagePtr=687882240, reqLen=277, targetRecPtr=<unavailable>, cur_page="\U00000018\xd1\U00000005", wait_for_wal=true) at xlogutils.c:950:4 [opt] frame #4: 0x0000000100298f0c postgres`ReadPageInternal(state=0x0000000126057298, pageptr=687882240, reqLen=277) at xlogreader.c:1061:12 [opt] frame #5: 0x0000000100298300 postgres`XLogReadAhead [inlined] XLogDecodeNextRecord(state=0x0000000126057298, nonblocking=<unavailable>) at xlogreader.c:727:14 [opt] frame #6: 0x00000001002980b4 postgres`XLogReadAhead(state=0x0000000126057298, nonblocking=false) at xlogreader.c:974:11 [opt] frame #7: 0x0000000100297fd4 postgres`XLogReadRecord(state=0x0000000126057298, errormsg=0x000000016fc24d08) at xlogreader.c:405:3 [opt] frame #8: 0x00000001004ee514 postgres`pg_logical_slot_get_changes_guts(fcinfo=<unavailable>, confirm=true, binary=<unavailable>) at logicalfuncs.c:257:13 [opt] frame #9: 0x00000001004ee1e8 postgres`pg_logical_slot_get_changes(fcinfo=<unavailable>) at logicalfuncs.c:333:9 [opt] frame #10: 0x00000001003a77ec postgres`ExecMakeTableFunctionResult(setexpr=0x000000012603da40, econtext=<unavailable>, argContext=<unavailable>, expectedDesc=0x00000001260472b0, randomAccess=false) at execSRF.c:234:13 [opt] frame #11: 0x00000001003ba02c postgres`FunctionNext(node=0x000000012603d6d0) at nodeFunctionscan.c:94:5 [opt] frame #12: 0x00000001003b4894 postgres`fetch_input_tuple [inlined] ExecProcNode(node=0x000000012603d6d0) at executor.h:313:9 [opt] frame #13: 0x00000001003b4878 postgres`fetch_input_tuple(aggstate=0x000000012603cf28) at nodeAgg.c:563:10 [opt] frame #14: 0x00000001003b2eb4 postgres`ExecAgg [inlined] agg_retrieve_direct(aggstate=0x000000012603cf28) at nodeAgg.c:2450:17 [opt] frame #15: 0x00000001003b2ab8 postgres`ExecAgg(pstate=0x000000012603cf28) at nodeAgg.c:2265:14 [opt] so apparently it's waiting patiently for WAL that will never appear. (I do see the 'checkpoint complete' log message before the shutdown in this particular run.) I still have this debugger session open if there's any particular data you'd like me to look at. regards, tom lane