On Tue, Apr 7, 2026 at 9:25 PM Heikki Linnakangas <[email protected]> wrote: > > On 07/04/2026 16:47, Andres Freund wrote: > > On 2026-04-07 16:07:45 +0300, Heikki Linnakangas wrote: > >> On 28/03/2026 06:18, Ashutosh Bapat wrote: > >>> Parallely myself and Palak Chaturvedi developed a quick patch to > >>> modernise pg_buffercache_pages() and use tuplestore so that it doesn't > >>> have to rely on NBuffers being the same between start of the scan, > >>> when memory allocated, when the scan ends - a condition possible with > >>> resizing buffer cache. It seems to improve the timings by about 10-30% > >>> on my laptop for 128MB buffercache size. Without this patch the time > >>> taken to execute Lukas's query varies between 10-15ms on my laptop. > >>> With this patch it varies between 8-9ms. So the timing is more stable > >>> as a side effect. It's not a 10x improvement that we are looking for > >>> but it looks like a step in the right direction. That improvement > >>> seems to come purely because we avoid creating a heap tuple. I wonder > >>> if there are some places up in the execution tree where full > >>> heaptuples get formed again instead of continuing to use minimal > >>> tuples or places where we perform some extra actions that are not > >>> required. > > > > I don't think that's the reason for the improvement - tuplestore_putvalues() > > forms a minimal tuple, and the cost to form a minimal tuple and a heap tuple > > aren't meaningfully different. > > Yeah, I wasn't fully convinced of that part either, which is why I left > it out of the commit message. I mostly wanted to get rid of the > double-buffering where we first accumulated all the data in an array. >
Just because of the name of the function, I thought tuplestore_puttuple stores virtual tuple. But looking at the function it's clearly using minimal tuples. Sorry for my misunderstanding. > > I think the problem is that we materialize rowmode SRFs as a tuplestore if > > they are in the from list. You can easily see this even with just > > generate_series(): > > > > postgres[1520825][1]=# SELECT count(*) FROM generate_series(1, 1000000); > > ┌─────────┐ > > │ count │ > > ├─────────┤ > > │ 1000000 │ > > └─────────┘ > > (1 row) > > > > Time: 117.939 ms > > postgres[1520825][1]=# SELECT count(*) FROM (SELECT generate_series(1, > > 1000000)); > > ┌─────────┐ > > │ count │ > > ├─────────┤ > > │ 1000000 │ > > └─────────┘ > > (1 row) > > > > Time: 58.914 ms > > Oh, to be honest I didn't remember that we *don't* materialize when it's > in the target list. > IIUC, query using SRF in targetlist runs faster because it does "not" use tuplestore. I consistently saw 10%-30% performance improvement after using tuplestore in pg_buffercache_pages(). Is that purely because of avoiding an in-memory array? -- Best Wishes, Ashutosh Bapat
