On Thu, Jun 25, 2020 at 12:24:29AM +1200, David Rowley wrote:
> On Wed, 24 Jun 2020 at 21:06, Bruce Momjian <br...@momjian.us> wrote:
> > I
> > don't remember anyone complaining about spills to disk during merge
> > join, so I am unclear why we would need such a control for hash join.
>
> Hash aggregate, you mean? The reason is that upgrading to PG13 can

Yes, sorry.

> cause a performance regression for an underestimated ndistinct on the
> GROUP BY clause and cause hash aggregate to spill to disk where it
> previously did everything in RAM. Sure, that behaviour was never
> what we wanted to happen, Jeff has fixed that now, but the fact
> remains that this does happen in the real world quite often and people
> often get away with it, likely because work_mem is generally set to
> some very conservative value. Of course, there's also a bunch of
> people that have been bitten by OOM due to this too. The "neverspill"
> wouldn't be for those people. Certainly, it's possible that we just
> tell these people to increase work_mem for this query, that way they
> can set it to something reasonable and still get spilling if it's
> really needed to save them from OOM, but the problem there is that
> it's not very easy to go and set work_mem for a certain query.

Well, my point is that merge join works that way, and no one has needed
a knob to avoid mergejoin if it is going to spill to disk. If they are
adjusting work_mem to prevent spill of merge join, they can do the same
for hash agg. We just need to document this in the release notes.

-- 
  Bruce Momjian  <br...@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee