On Sun, Jun 28, 2020 at 05:40:16PM -0700, Peter Geoghegan wrote:
> I think any problem that might come up with the costing is best
> thought of as a distinct problem. This thread is mostly about the
> problem of users getting fewer in-memory hash aggregates compared to a
> previous release running the same application (though there has been
> some discussion of the other problem, too [1], but it's thought to be
> less serious).
> 
> The problem is that affected users were theoretically never entitled
> to the performance they came to rely on, and yet there is good reason
> to think that hash aggregate really should be entitled to more memory.
> They won't care that they were theoretically never entitled to that
> performance, though -- they *liked* the fact that hash agg could
> cheat. And they'll dislike the fact that this cannot be corrected by
> tuning work_mem, since that affects all node types that consume
> work_mem, not just hash aggregate -- that could cause OOMs for them.
> 
> There are two or three similar ideas under discussion that might fix
> the problem. They all seem to involve admitting that hash aggregate's
> "cheating" might actually have been a good thing all along (even
> though giving hash aggregate much much more memory than other nodes is
> terrible), and giving hash aggregate license to "cheat openly". Note
> that the problem isn't exactly a problem with the hash aggregate
> spilling patch. You could think of the problem as a pre-existing issue
> -- a failure to give more memory to hash aggregate, which really
> should be entitled to more memory. Jeff's patch just made the issue
> more obvious.

In thinking some more about this, I came up with two ideas.  First, in
pre-PG 13, we didn't choose hash_agg if we thought it would spill, but
if we misestimated and it used more than work_mem, we allowed it.  The
effect of this is that if we were close but went over, we allowed it,
but only for hash_agg.  Is this something we want to codify for all
node types, i.e., avoid choosing a node type that we estimate would
need a lot more than work_mem, but then let work_mem be a soft limit
for whatever node we do choose, e.g., allow 50% over work_mem in the
executor to absorb misestimation before spilling?  My point is, do we
want to use a lower work_mem for planning and a higher one in the
executor before spilling?

My second thought comes from an earlier report that spilling is very
expensive, while a smaller work_mem doesn't seem to hurt much.  Would we
achieve better overall performance by giving a few nodes a lot of memory
(so they never spill) and other nodes very little, rather than giving
them all the same amount and having them all spill?

-- 
  Bruce Momjian  <br...@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee


