On Fri, 23 Aug 2024 at 01:29, Michael Paquier <mich...@paquier.xyz> wrote:
> On Thu, Aug 22, 2024 at 10:36:38AM -0400, Alvaro Herrera wrote:
> > On 2024-Aug-22, Michael Paquier wrote:
> >> I'm not sure that we need to get down to that until somebody has a
> >> case where they want to rely on stats of injection points for their
> >> stuff.  At this stage, I only want the stats to be enabled to provide
> >> automated checks for the custom pgstats APIs, so disabling it by
> >> default and enabling it only in the stats test of the module
> >> injection_points sounds kind of enough to me for now.
> >
> > Oh! I thought the stats were useful by themselves.
>
> Yep, currently they're not, but I don't want to discard that they'll
> never be, either.  Perhaps there would be a case where somebody would
> like to run a callback N times and trigger a condition?  That's
> something where the stats could be useful, but I don't have a specific
> case for that now.  I'm just imagining possibilities.
>

I believe I am seeing the problem being discussed occurring on a
production system running 15.6. It causes ever-increasing replay lag on
the standby until I cancel the offending process on the standby, which
forces it to process its interrupts.

Here's the backtrace before I do that:

#0  0x00007f4503b81876 in select () from /lib64/libc.so.6
#1  0x0000558b0956891a in pg_usleep (microsec=microsec@entry=1000) at pgsleep.c:56
#2  0x0000558b0917e01a in GetMultiXactIdMembers (from_pgupgrade=false, onlyLock=<optimized out>, members=0x7ffcd2a9f1e0, multi=109187502) at multixact.c:1392
#3  GetMultiXactIdMembers (multi=109187502, members=members@entry=0x7ffcd2a9f1e0, from_pgupgrade=from_pgupgrade@entry=false, onlyLock=onlyLock@entry=false) at multixact.c:1224
#4  0x0000558b0913de15 in MultiXactIdGetUpdateXid (xmax=<optimized out>, t_infomask=<optimized out>) at heapam.c:6924
#5  0x0000558b09146028 in HeapTupleGetUpdateXid (tuple=tuple@entry=0x7f440d428308) at heapam.c:6965
#6  0x0000558b0914c02f in HeapTupleSatisfiesMVCC (htup=0x558b0b7cbf20, htup=0x558b0b7cbf20, buffer=8053429, snapshot=0x558b0b63a2d8) at heapam_visibility.c:1089
#7  HeapTupleSatisfiesVisibility (tup=tup@entry=0x7ffcd2a9f2b0, snapshot=snapshot@entry=0x558b0b63a2d8, buffer=buffer@entry=8053429) at heapam_visibility.c:1771
#8  0x0000558b0913e819 in heapgetpage (sscan=sscan@entry=0x558b0b7ccfa0, page=page@entry=115) at heapam.c:468
#9  0x0000558b0913eb7e in heapgettup_pagemode (scan=scan@entry=0x558b0b7ccfa0, dir=ForwardScanDirection, nkeys=0, key=0x0) at heapam.c:1120
#10 0x0000558b0913fb5e in heap_getnextslot (sscan=0x558b0b7ccfa0, direction=<optimized out>, slot=0x558b0b7cc000) at heapam.c:1352
#11 0x0000558b092c0e7a in table_scan_getnextslot (slot=0x558b0b7cc000, direction=ForwardScanDirection, sscan=<optimized out>) at ../../../src/include/access/tableam.h:1046
#12 SeqNext (node=0x558b0b7cbe10) at nodeSeqscan.c:80
#13 0x0000558b0929b9bf in ExecScan (node=0x558b0b7cbe10, accessMtd=0x558b092c0df0 <SeqNext>, recheckMtd=0x558b092c0dc0 <SeqRecheck>) at execScan.c:198
#14 0x0000558b09292cb2 in ExecProcNode (node=0x558b0b7cbe10) at ../../../src/include/executor/executor.h:262
#15 ExecutePlan (execute_once=<optimized out>, dest=0x558b0bca1350, direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x558b0b7cbe10, estate=0x558b0b7cbbe8) at execMain.c:1636
#16 standard_ExecutorRun (queryDesc=0x558b0b8c9798, direction=<optimized out>, count=0, execute_once=<optimized out>) at execMain.c:363
#17 0x00007f44f64d43c5 in pgss_ExecutorRun (queryDesc=0x558b0b8c9798, direction=ForwardScanDirection, count=0, execute_once=<optimized out>) at pg_stat_statements.c:1010
#18 0x0000558b093fda0f in PortalRunSelect (portal=portal@entry=0x558b0b6ba458, forward=forward@entry=true, count=0, count@entry=9223372036854775807, dest=dest@entry=0x558b0bca1350) at pquery.c:924
#19 0x0000558b093fedb8 in PortalRun (portal=portal@entry=0x558b0b6ba458, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true, dest=dest@entry=0x558b0bca1350, altdest=altdest@entry=0x558b0bca1350, qc=0x7ffcd2a9f7a0) at pquery.c:768
#20 0x0000558b093fb243 in exec_simple_query (query_string=0x558b0b6170c8 "<redacted>") at postgres.c:1250
#21 0x0000558b093fd412 in PostgresMain (dbname=<optimized out>, username=<optimized out>) at postgres.c:4598
#22 0x0000558b0937e170 in BackendRun (port=<optimized out>, port=<optimized out>) at postmaster.c:4514
#23 BackendStartup (port=<optimized out>) at postmaster.c:4242
#24 ServerLoop () at postmaster.c:1809
#25 0x0000558b0937f147 in PostmasterMain (argc=argc@entry=5, argv=argv@entry=0x558b0b5d0a30) at postmaster.c:1481
#26 0x0000558b09100a2c in main (argc=5, argv=0x558b0b5d0a30) at main.c:202

This occurred twice, meaning 2 processes needed terminating.

Thom