On Fri, 23 Aug 2024 at 01:29, Michael Paquier <mich...@paquier.xyz> wrote:

> On Thu, Aug 22, 2024 at 10:36:38AM -0400, Alvaro Herrera wrote:
> > On 2024-Aug-22, Michael Paquier wrote:
> >> I'm not sure that we need to get down to that until somebody has a
> >> case where they want to rely on stats of injection points for their
> >> stuff.  At this stage, I only want the stats to be enabled to provide
> >> automated checks for the custom pgstats APIs, so disabling it by
> >> default and enabling it only in the stats test of the module
> >> injection_points sounds kind of enough to me for now.
> >
> > Oh!  I thought the stats were useful by themselves.
>
> Yep, currently they're not, but I don't want to discard that they'll
> never be, either.  Perhaps there would be a case where somebody would
> like to run a callback N times and trigger a condition?  That's
> something where the stats could be useful, but I don't have a specific
> case for that now.  I'm just imagining possibilities.
>

I believe I am seeing the problem being discussed occuring on a production
system running 15.6, causing ever-increasing replay lag on the standby,
until I cancel the offending process on the standby and force it to process
its interrupts.

Here's the backtrace before I do that:

#0  0x00007f4503b81876 in select () from /lib64/libc.so.6
#1  0x0000558b0956891a in pg_usleep (microsec=microsec@entry=1000) at
pgsleep.c:56
#2  0x0000558b0917e01a in GetMultiXactIdMembers (from_pgupgrade=false,
onlyLock=<optimized out>, members=0x7ffcd2a9f1e0, multi=109187502) at
multixact.c:1392
#3  GetMultiXactIdMembers (multi=109187502,
members=members@entry=0x7ffcd2a9f1e0,
from_pgupgrade=from_pgupgrade@entry=false, onlyLock=onlyLock@entry=false)
at multixact.c:1224
#4  0x0000558b0913de15 in MultiXactIdGetUpdateXid (xmax=<optimized out>,
t_infomask=<optimized out>) at heapam.c:6924
#5  0x0000558b09146028 in HeapTupleGetUpdateXid
(tuple=tuple@entry=0x7f440d428308)
at heapam.c:6965
#6  0x0000558b0914c02f in HeapTupleSatisfiesMVCC (htup=0x558b0b7cbf20,
htup=0x558b0b7cbf20, buffer=8053429, snapshot=0x558b0b63a2d8) at
heapam_visibility.c:1089
#7  HeapTupleSatisfiesVisibility (tup=tup@entry=0x7ffcd2a9f2b0,
snapshot=snapshot@entry=0x558b0b63a2d8, buffer=buffer@entry=8053429) at
heapam_visibility.c:1771
#8  0x0000558b0913e819 in heapgetpage (sscan=sscan@entry=0x558b0b7ccfa0,
page=page@entry=115) at heapam.c:468
#9  0x0000558b0913eb7e in heapgettup_pagemode (scan=scan@entry=0x558b0b7ccfa0,
dir=ForwardScanDirection, nkeys=0, key=0x0) at heapam.c:1120
#10 0x0000558b0913fb5e in heap_getnextslot (sscan=0x558b0b7ccfa0,
direction=<optimized out>, slot=0x558b0b7cc000) at heapam.c:1352
#11 0x0000558b092c0e7a in table_scan_getnextslot (slot=0x558b0b7cc000,
direction=ForwardScanDirection, sscan=<optimized out>) at
../../../src/include/access/tableam.h:1046
#12 SeqNext (node=0x558b0b7cbe10) at nodeSeqscan.c:80
#13 0x0000558b0929b9bf in ExecScan (node=0x558b0b7cbe10,
accessMtd=0x558b092c0df0 <SeqNext>, recheckMtd=0x558b092c0dc0 <SeqRecheck>)
at execScan.c:198
#14 0x0000558b09292cb2 in ExecProcNode (node=0x558b0b7cbe10) at
../../../src/include/executor/executor.h:262
#15 ExecutePlan (execute_once=<optimized out>, dest=0x558b0bca1350,
direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>,
operation=CMD_SELECT, use_parallel_mode=<optimized out>,
planstate=0x558b0b7cbe10, estate=0x558b0b7cbbe8)
    at execMain.c:1636
#16 standard_ExecutorRun (queryDesc=0x558b0b8c9798, direction=<optimized
out>, count=0, execute_once=<optimized out>) at execMain.c:363
#17 0x00007f44f64d43c5 in pgss_ExecutorRun (queryDesc=0x558b0b8c9798,
direction=ForwardScanDirection, count=0, execute_once=<optimized out>) at
pg_stat_statements.c:1010
#18 0x0000558b093fda0f in PortalRunSelect (portal=portal@entry=0x558b0b6ba458,
forward=forward@entry=true, count=0, count@entry=9223372036854775807,
dest=dest@entry=0x558b0bca1350) at pquery.c:924
#19 0x0000558b093fedb8 in PortalRun (portal=portal@entry=0x558b0b6ba458,
count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true,
run_once=run_once@entry=true, dest=dest@entry=0x558b0bca1350,
    altdest=altdest@entry=0x558b0bca1350, qc=0x7ffcd2a9f7a0) at pquery.c:768
#20 0x0000558b093fb243 in exec_simple_query (
    query_string=0x558b0b6170c8 "<redacted>")
    at postgres.c:1250
#21 0x0000558b093fd412 in PostgresMain (dbname=<optimized out>,
username=<optimized out>) at postgres.c:4598
#22 0x0000558b0937e170 in BackendRun (port=<optimized out>, port=<optimized
out>) at postmaster.c:4514
#23 BackendStartup (port=<optimized out>) at postmaster.c:4242
#24 ServerLoop () at postmaster.c:1809
#25 0x0000558b0937f147 in PostmasterMain (argc=argc@entry=5,
argv=argv@entry=0x558b0b5d0a30)
at postmaster.c:1481
#26 0x0000558b09100a2c in main (argc=5, argv=0x558b0b5d0a30) at main.c:202

This occurred twice, meaning 2 processes needed terminating.

Thom

Reply via email to