Re: Spilling hashed SetOps and aggregates to disk

Tomas Vondra Tue, 05 Jun 2018 06:25:51 -0700

On 06/05/2018 03:17 PM, David Rowley wrote:

On 6 June 2018 at 01:09, Andres Freund <and...@anarazel.de> wrote:

On 2018-06-06 01:06:39 +1200, David Rowley wrote:

My concern is that only accounting memory for the group and not the
state is only solving half the problem. It might be fine for
aggregates that don't stray far from their aggtransspace, but for the
other ones, we could still see OOM.

If solving the problem completely is too hard, then a half fix (maybe
3/4) is better than nothing, but if we can get a design for a full fix
before too much work is done, then isn't that better?


I don't think we actually disagree.  I was really primarily talking
about the case where we can't really do better because we don't have
serialization support.  I mean we could just rescan from scratch, using
a groupagg, but that obviously sucks.


I don't think we do. To take yours to the 100% solution might just
take adding the memory accounting to palloc that Jeff proposed a few
years ago and use that accounting to decide when we should switch
method.

However, I don't quite fully recall how the patch accounted for
memory consumed by sub-contexts and if getting the entire
consumption required recursively looking at subcontexts. If that's
the case then checking the consumption would likely cost too much if
it was done after each tuple was aggregated.

Yeah, a simple wrapper would not really fix the issue, becauseallocating memory in the subcontext would not update the accountinginformation. So we'd be oblivious of possibly large amounts of memory. Idon't quite see how is this related to (not) having serializationsupport, as it's more about not knowing we already hit the memory limit.

That being said, I don't think array_agg actually has the issue, atleast since b419865a814abbca12bdd6eef6a3d5ed67f432e1 (it does behavedifferently in aggregate and non-aggregate contexts, IIRC).

I don't know how common this issue is, though. I don't think anybuilt-in aggregates create additional contexts, but I may be wrong. Butwe have this damn extensibility thing, where users can write their ownaggregates ...


regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Spilling hashed SetOps and aggregates to disk

Reply via email to