On Sat, Jun 27, 2020 at 3:00 AM Amit Kapila <amit.kapil...@gmail.com> wrote:
> I think the advantage of delaying it is that we
> might see some real problems (like where hash aggregate is not a good
> choice) which can be fixed via the costing model.
I think any problem that might come up with the costing is best
thought of as a distinct problem. This thread is mostly about users
getting fewer in-memory hash aggregates than they did in a previous
release running the same application (there has been some discussion
of the other problem too [1], but it's thought to be less serious).

The problem is that affected users were theoretically never entitled
to the performance they came to rely on, and yet there is good reason
to think that hash aggregate really should be entitled to more memory.
They won't care that they were theoretically never entitled to that
performance, though -- they *liked* the fact that hash agg could
cheat. And they'll dislike the fact that this cannot be corrected by
tuning work_mem, since that affects every node type that consumes
work_mem, not just hash aggregate -- raising it across the board could
cause OOMs for them.

There are two or three similar ideas under discussion that might fix
the problem. They all seem to involve admitting that hash aggregate's
"cheating" might actually have been a good thing all along (even
though giving hash aggregate vastly more memory than other nodes is
terrible), and giving hash aggregate license to "cheat openly".

Note that the problem isn't exactly a problem with the hash aggregate
spilling patch. You could think of it as a pre-existing issue -- a
failure to give hash aggregate the extra memory it really should be
entitled to. Jeff's patch just made the issue more obvious.

[1] https://postgr.es/m/20200624191433.5gnqgrxfmucex...@alap3.anarazel.de

--
Peter Geoghegan