On 3/16/26 19:26, Tomas Vondra wrote: > On 3/16/26 18:29, Melanie Plageman wrote: >> On Mon, Mar 16, 2026 at 3:08 AM Lukas Fittl <[email protected]> wrote: > ... > >>> I feel like something is off about the complexity of having each node >>> type ferry back the information. e.g. when you're implementing the >>> support for index prefetching, it'll require a bunch more changes. In >>> my mind, there is a reason we have a related problem that we solved >>> with the current pgBufferUsage, instead of dealing with that on a >>> per-node basis. I really feel we should have a more generic way of >>> dealing with this. >> <--snip--> >>> I've attached a prototype of how that could look like (apply the other >>> patch set first, v8, see commit fest entry [1] - also attached a >>> preparatory refactoring of using "Instrumentation" for parallel query >>> reporting, which avoids having individual structs there). >>
It looks reasonable, and it's nice to not have to worry about passing the data through the TAM interface etc. Having to do stuff like this: - if (instr->instr.need_bufusage || instr->instr.need_walusage) + if (instr->instr.need_bufusage || instr->instr.need_walusage || instr->instr.need_iousage) in so many places is not great. It'd really benefit from some macro that says "if any of the pieces is needed ...". >> The patch footprint is _much_ nicer with your stack-based >> instrumentation. Very cool. I'll leave it to Tomas whether he wants to >> create a dependency on your big project a few weeks before feature >> freeze, though. >> > > I don't know. I have not looked at the stack-based instrumentation yet, > so I have no idea how complex or how close to committable it is. > I've started looking at the stack tracking patch only today, and it looks in pretty good shape. I have no clear idea how close to being committed it is, so I'm hesitant to make this patch depend on it. AFAICS the stack-based tracking is meant for resources with just one instance. E.g. there's only one WAL, or one buffer pool, so when a node says "update WAL usage" or "increment buffer hit/read", there's one counter to update. But will that necessarily apply for read streams? AFAIK existing scans use only a single read stream, but that's only because none of the built-in index AMs uses a read stream. If we get index prefetching in PG19, or if any index AM (e.g. pgvector) chooses to use a read stream internally, how will that work? Will we "merge" the stats from all read streams used by the node, or what will happen? I realize this is a similar issue the 0007 part of the stack-based tracing patch deals with - tracking buffers for index/table separately. And it does that by making nodeIndexscan.c quite a bit more complex by expanding index_getnext_slot(). Doesn't seem great, but not sure. I suppose nodeIndexonlyscan.c will have to do something similar. regards -- Tomas Vondra
