On Mon, Jun 29, 2020 at 2:46 PM Peter Geoghegan <p...@bowt.ie> wrote: > On Mon, Jun 29, 2020 at 2:22 PM Tomas Vondra > <tomas.von...@2ndquadrant.com> wrote: > > I agree with this, and I'm mostly OK with having hash_mem. In fact, from > > the proposals in this thread I like it the most - as long as it's used > > both during planning and execution. It's a pretty clear solution. > > Great. > > It's not trivial to write the patch, since there are a few tricky > cases. For example, maybe there is some subtlety in nodeAgg.c with > AGG_MIXED cases.
Attached is an attempt at this. I have not been particularly thorough, since it is still not completely clear that the hash_mem proposal has a serious chance of resolving the "many users rely on hashagg exceeding work_mem, regardless of whether or not that is the intended behavior in Postgres 12" problem. But at least we have a patch now, and so have some idea of how invasive this will have to be. We also have something to test. Note that I created a new open item for this "maybe we need something like a hash_mem GUC now" problem today. To recap, this thread started out being a discussion about the enable_hashagg_disk GUC, which seems like a distinct problem to me. It won't make much sense to return to discussing the original problem before we have a solution to this other problem (the problem that I propose to address by inventing hash_mem). About the patch: The patch adds hash_mem, which is just another work_mem-like GUC that the patch has us use in certain cases -- cases where the work area is a hash table (but not when it's a sort, or some kind of bitset, or anything else). I still think that the new GUC should work as a multiplier of work_mem, or something else along those lines, though for now it's just an independent work_mem used for hashing. I bring it up again because I'm concerned about users that upgrade to Postgres 13 incautiously, and find that hashing uses *less* memory than before. Many users probably get away with setting work_mem quite high across the board. At the very least, hash_mem should be ignored when it's set to below work_mem (which isn't what the patch does). It might have made more sense to call the new GUC hash_work_mem instead of hash_mem. I don't feel strongly about the name. Again, this is just a starting point for further discussion. -- Peter Geoghegan
0001-Add-a-GUC-that-limits-memory-use-for-hash-tables.patch
Description: Binary data