On Sat, Jul 11, 2020 at 4:23 PM Tomas Vondra <tomas.von...@2ndquadrant.com> wrote:
> I find that example rather suspicious. I mean, what exactly in the
> GroupAgg plan would consume this memory? Surely it'd have to be some
> node below the grouping, but sort shouldn't do that, no?
>
> Seems strange.
Well, I imagine hash aggregate manages to use much less memory than the
equivalent groupagg's sort, even though to the optimizer it appears as
if hash agg should end up using more memory (which the optimizer does
not allow once the estimate exceeds work_mem, regardless of whether or
not hash agg would be faster). It may also be relevant that hash agg
can use less memory simply by being faster: finishing sooner can easily
reduce memory usage for the system as a whole, even if you assume that
an individual hash agg node uses more memory than the equivalent group
agg for as long as it runs. So in-memory hash agg is effectively less
memory hungry.

It's not a great example of a specific case that we'd regress by not
having hash_mem/hash_mem_multiplier. It's an overestimate where older
releases accidentally got a bad, slow plan, not an underestimate where
older releases "lived beyond their means but got away with it" by
getting a good, fast plan. Still, ISTM that it's a good illustration of
the strange dynamics involved.

> I agree grouping estimates are often quite off, and I kinda agree with
> introducing hash_mem (or at least with the concept that hashing is more
> sensitive to amount of memory than sort). Not sure it's the right escape
> hatch to the hashagg spill problem, but maybe it is.

The hash_mem/hash_mem_multiplier proposal aims to fix the problem
directly, rather than being an escape hatch, because we don't like
escape hatches. I think that it probably fixes many or most of the
problems in practice, at least assuming that the admin is willing to
tune it. But a small number of remaining installations may still need a
"true" escape hatch. There is an argument for having both, though I
hope that the escape hatch can be avoided.

--
Peter Geoghegan
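
P.S. To make the intended usage concrete, here is a rough sketch of how
the proposed setting might be tuned, assuming hash_mem_multiplier is
committed roughly as discussed in this thread (hash-based nodes get
work_mem * hash_mem_multiplier as their limit, while sort-based nodes
still get plain work_mem). The table and column names are made up:

  -- Sketch only: hash_mem_multiplier is the proposed GUC, not something
  -- that can be relied on in released versions yet.
  SET work_mem = '64MB';
  SET hash_mem_multiplier = 4.0;  -- hash agg gets up to ~256MB before spilling

  -- With the extra headroom the planner should be willing to choose
  -- HashAggregate for a grouping whose estimated hash table fits in
  -- 256MB but not in 64MB, without raising work_mem for sorts too.
  EXPLAIN (ANALYZE, BUFFERS)
  SELECT col, count(*) FROM tbl GROUP BY col;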