Re: [HACKERS] PATCH: decreasing memory needlessly consumed by array_agg

Tom Lane Wed, 28 Jan 2015 12:29:47 -0800

Jeff Davis <[email protected]> writes:
> On Tue, Jan 20, 2015 at 6:44 AM, Tom Lane <[email protected]> wrote:
>> Uh, sorry, I've not been paying any attention to this thread for awhile.
>> What's the remaining questions at issue?


> This patch is trying to improve the array_agg case where there are
> many arrays being constructed simultaneously, such as in HashAgg. You
> strongly suggested that a commitable patch would differentiate between
> the grouped and ungrouped cases (or perhaps you meant differentiate
> between HashAgg and sorted aggregation?). Tomas's current patch does
> not do so; it does two main things:
>    1. Uses a common memory context for arrays being constructed by
> array_agg in any context (ungrouped, sorted agg, and HashAgg)
>    2. Reduces the initial array allocation to 8 elements from 64, but
> preserves doubling behavior

Sorry for slow response on this.  I took a look at the v8 patch (that's
still latest no?).  I see that it only gets rid of the separate context
for the two callers array_agg_transfn and array_agg_array_transfn, for
which no memory release can happen anyway until the parent context is
reset (cf comments in corresponding finalfns).  So that's fine --- it is
not making any bloat case any worse, at least.

I'm not sure about whether reducing the initial Datum array size
across-the-board is a good thing or not.  One obvious hack to avoid
unexpected side effects on other callers would be

        astate->alen = (subcontext ? 64 : 8);

The larger initial size is basically free when subcontext is true, anyway,
considering it will be coming out of an 8K initial subcontext allocation.

This still leaves us wondering whether the smaller initial size will
hurt array_agg itself, but that's at least capable of being tested
with a reasonably small number of test cases.

I strongly object to removing initArrayResultArr's element_type argument.
That's got nothing to do with the stated purpose of the patch, and it
forces a non-too-cheap catalog lookup to be done inside
initArrayResultArr.  It's true that some of the current call sites are
just doing the same lookup themselves anyway, but we should not foreclose
the possibility that the caller has the data already (as some others do)
or is able to cache it across multiple uses.  A possible compromise is
to add the convention that callers can pass InvalidOid if they want
initArrayResultArr to do the lookup.  (In any case, the function's header
comment needs adjustment if the API spec changes.)

Other than that, I think the API changes here are OK.  Adding a new
argument to existing functions is something we do all the time, and it's
clear how to modify any callers to get the same behavior as before.
We could perhaps clean things up with a more invasive redefinition, but
I doubt it's worth inflicting more pain on third party callers.

Another thing I'd change is this:
 
+       /* we can only release the context if it's a private one. */
+       Assert(! (release && !astate->private_cxt));
+
        /* Clean up all the junk */
        if (release)
                MemoryContextDelete(astate->mcontext);

Why not this way:

        /* Clean up all the junk */
        if (release)
+       {
+               Assert(astate->private_cxt);
                MemoryContextDelete(astate->mcontext);
+       }

Seems a lot more understandable, and less code too.

I concur with the concerns that the comments could do with more work,
but haven't attempted to improve them myself.

                        regards, tom lane


-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PATCH: decreasing memory needlessly consumed by array_agg

Reply via email to