On Thu, Jun 25, 2020 at 12:59 AM Robert Haas <robertmh...@gmail.com> wrote:
>
> So, I don't think we can wire in a constant like 10x. That's really
> unprincipled and I think it's a bad idea. What we could do, though, is
> replace the existing Boolean-valued GUC with a new GUC that controls
> the size at which the aggregate spills. The default could be -1,
> meaning work_mem, but a user could configure a larger value if desired
> (presumably, we would just treat a value smaller than work_mem as
> work_mem, and document the same).
>
> I think that's actually pretty appealing. Separating the memory we
> plan to use from the memory we're willing to use before spilling seems
> like a good idea in general, and I think we should probably also do it
> in other places - like sorts.
>
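For reference, the rule described above amounts to something like the
sketch below. This is only a minimal, self-contained illustration of the
clamping behaviour, not PostgreSQL source; the name hashagg_spill_mem is
invented here purely for the example, and the units are assumed to be kB
like work_mem.

    #include <stdio.h>

    static int work_mem = 4096;            /* kB, the per-node memory budget */
    static int hashagg_spill_mem = -1;     /* kB, -1 = "same as work_mem" */

    /* Resolve the memory level at which hash aggregation would spill. */
    static int
    effective_spill_limit(void)
    {
        if (hashagg_spill_mem < 0)
            return work_mem;                /* default: current behaviour */
        if (hashagg_spill_mem < work_mem)
            return work_mem;                /* values below work_mem are clamped up */
        return hashagg_spill_mem;
    }

    int
    main(void)
    {
        printf("spill limit = %d kB\n", effective_spill_limit());   /* 4096 */
        hashagg_spill_mem = 1024;
        printf("spill limit = %d kB\n", effective_spill_limit());   /* 4096 */
        hashagg_spill_mem = 65536;
        printf("spill limit = %d kB\n", effective_spill_limit());   /* 65536 */
        return 0;
    }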
+1. I also think a GUC along these lines could help not only with the
problem being discussed here but in other cases as well. However, I
think the real question is: do we want to design and implement it for
PG13? It seems to me that at this stage we don't have a clear
understanding of what percentage of real-world cases will be impacted
by the new hash aggregate behavior.

We want to provide some mechanism as a safety net to avoid problems
that users might face, which is not a bad idea, but what if we wait and
see the real impact first? Would it be too bad to provide a GUC later
in a back branch if we see users hitting such problems quite often? I
think the advantage of delaying it is that we might see some real
problems (for example, cases where hash aggregate is not a good choice)
which can instead be fixed via the costing model.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com