Re: Default setting for enable_hashagg_disk

Bruce Momjian Wed, 24 Jun 2020 10:13:11 -0700

On Thu, Jun 25, 2020 at 12:24:29AM +1200, David Rowley wrote:
> On Wed, 24 Jun 2020 at 21:06, Bruce Momjian <br...@momjian.us> wrote:
> > I
> > don't remember anyone complaining about spills to disk during merge
> > join, so I am unclear why we would need a such control for hash join.
> 
> Hash aggregate, you mean?   The reason is that upgrading to PG13 can


Yes, sorry.

> cause a performance regression for an underestimated ndistinct on the
> GROUP BY clause and cause hash aggregate to spill to disk where it
> previously did everything in RAM.   Sure, that behaviour was never
> what we wanted to happen, Jeff has fixed that now, but the fact
> remains that this does happen in the real world quite often and people
> often get away with it, likey because work_mem is generally set to
> some very conservative value.  Of course, there's also a bunch of
> people that have been bitten by OOM due to this too. The "neverspill"
> wouldn't be for those people.  Certainly, it's possible that we just
> tell these people to increase work_mem for this query, that way they
> can set it to something reasonable and still get spilling if it's
> really needed to save them from OOM, but the problem there is that
> it's not very easy to go and set work_mem for a certain query.

Well, my point is that merge join works that way, and no one has needed
a knob to avoid mergejoin if it is going to spill to disk.  If they are
adjusting work_mem to prevent spill of merge join, they can do the same
for hash agg.  We just need to document this in the release notes.

-- 
  Bruce Momjian  <br...@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee

Re: Default setting for enable_hashagg_disk

Reply via email to