Re: EXPLAIN: showing ReadStream / prefetch stats

Tomas Vondra Mon, 16 Mar 2026 17:15:49 -0700

On 3/16/26 19:26, Tomas Vondra wrote:
> On 3/16/26 18:29, Melanie Plageman wrote:
>> On Mon, Mar 16, 2026 at 3:08 AM Lukas Fittl <[email protected]> wrote:
> ...
> 
>>> I feel like something is off about the complexity of having each node
>>> type ferry back the information. e.g. when you're implementing the
>>> support for index prefetching, it'll require a bunch more changes. In
>>> my mind, there is a reason we have a related problem that we solved
>>> with the current pgBufferUsage, instead of dealing with that on a
>>> per-node basis. I really feel we should have a more generic way of
>>> dealing with this.
>> <--snip-->
>>> I've attached a prototype of how that could look like (apply the other
>>> patch set first, v8, see commit fest entry [1] - also attached a
>>> preparatory refactoring of using "Instrumentation" for parallel query
>>> reporting, which avoids having individual structs there).
>>


It looks reasonable, and it's nice to not have to worry about passing
the data through the TAM interface etc. Having to do stuff like this:

-       if (instr->instr.need_bufusage || instr->instr.need_walusage)
+       if (instr->instr.need_bufusage || instr->instr.need_walusage ||
instr->instr.need_iousage)

in so many places is not great. It'd really benefit from some macro that
says "if any of the pieces is needed ...".

>> The patch footprint is _much_ nicer with your stack-based
>> instrumentation. Very cool. I'll leave it to Tomas whether he wants to
>> create a dependency on your big project a few weeks before feature
>> freeze, though.
>>
> 
> I don't know. I have not looked at the stack-based instrumentation yet,
> so I have no idea how complex or how close to committable it is.
> 

I've started looking at the stack tracking patch only today, and it
looks in pretty good shape. I have no clear idea how close to being
committed it is, so I'm hesitant to make this patch depend on it.

AFAICS the stack-based tracking is meant for resources with just one
instance. E.g. there's only one WAL, or one buffer pool, so when a node
says "update WAL usage" or "increment buffer hit/read", there's one
counter to update. But will that necessarily apply for read streams?

AFAIK existing scans use only a single read stream, but that's only
because none of the built-in index AMs uses a read stream. If we get
index prefetching in PG19, or if any index AM (e.g. pgvector) chooses to
use a read stream internally, how will that work? Will we "merge" the
stats from all read streams used by the node, or what will happen?

I realize this is a similar issue the 0007 part of the stack-based
tracing patch deals with - tracking buffers for index/table separately.
And it does that by making nodeIndexscan.c quite a bit more complex by
expanding index_getnext_slot(). Doesn't seem great, but not sure. I
suppose nodeIndexonlyscan.c will have to do something similar.


regards

-- 
Tomas Vondra

Re: EXPLAIN: showing ReadStream / prefetch stats

Reply via email to