On Fri, Mar 27, 2026 at 10:37 PM Masahiko Sawada <[email protected]> wrote:
>
> On Fri, Mar 27, 2026 at 3:58 PM Tomas Vondra <[email protected]> wrote:
> > One thing we lose by doing ad hoc aggregation (instead of just relying
> > on the regular SQL aggregation operators) is the memory limit.
> > There's a simple in-memory hash table, no spilling to disk etc. The
> > simple pg_buffercache view does not have this issue, because the
> > tuplestore will spill to disk after hitting work_mem. Simplehash won't.
> >
> > The entries are ~48B, so there would need to be buffers for ~100k
> > (relfilenode,forknum) combinations to overflow 4MB. It's not very
> > common, but I've seen systems with more relations than this. It would
> > be good to show some numbers demonstrating it's not an issue.
>
> Good point. I agree that we should not introduce the function in a way
> that risks using excessive memory without respecting work_mem or other
> GUC parameters.

Yeah, I agree that's problematic with regard to work_mem.

FWIW, I could see two methods to address that specifically, if we
wanted the special-purpose function:

1) Error out if our hash table grows too large and require the user to
increase work_mem to get the data - seems inconvenient, but might be
okay if we are typically below the work_mem limit anyway (I haven't
run the numbers on that yet)

2) Implement disk spill logic using a LogicalTapeSet or similar - I
think that'd be substantially more code, and it doesn't seem worth it
just for this (but if a situation like this recurs, we could consider
a more generalized facility)

Thanks,
Lukas

-- 
Lukas Fittl
