On Fri, Mar 27, 2026 at 10:37 PM Masahiko Sawada <[email protected]> wrote:
>
> On Fri, Mar 27, 2026 at 3:58 PM Tomas Vondra <[email protected]> wrote:
> > One thing we lose by doing ad hoc aggregation (instead of just relying
> > on the regular SQL aggregation operators) is the lack of a memory limit.
> > There's a simple in-memory hash table, no spilling to disk etc. The
> > simple pg_buffercache view does not have this issue, because the
> > tuplestore will spill to disk after hitting work_mem. Simplehash won't.
> >
> > The entries are ~48B, so there would need to be buffers for ~100k
> > (relfilenode, forknum) combinations to overflow 4MB. It's not very
> > common, but I've seen systems with more relations than this. It would
> > be good to show some numbers demonstrating it's not an issue.
>
> Good point. I agree that we should not introduce the function in a way
> that risks using excessive memory while not respecting work_mem or
> other GUC parameters.
Yeah, I agree that is problematic regarding work_mem. FWIW, I can see
two ways to address that specifically, if we wanted the special-purpose
function:

1) Error out if our hash table grows too large and require the user to
   increase work_mem to get the data - this seems inconvenient, but it
   might be acceptable if we are typically below the work_mem limit
   anyway (I haven't run the numbers on that yet)

2) Implement disk-spill logic using a LogicalTapeSet or similar - I
   think that would be substantially more code, and it doesn't seem
   worth it just for this (but if a situation like this recurs, we
   could consider a more generalized facility)

Thanks,
Lukas

--
Lukas Fittl
