On Tue, Jul 14, 2020 at 6:49 PM Peter Geoghegan <p...@bowt.ie> wrote:
> Maybe I missed your point here. The problem is not so much that we'll
> get HashAggs that spill -- there is nothing intrinsically wrong with
> that. While it's true that the I/O pattern is not as sequential as a
> similar group agg + sort, that doesn't seem like the really important
> factor here. The really important factor is that in-memory HashAggs
> can be blazingly fast relative to *any* alternative strategy -- be it
> a HashAgg that spills, or a group aggregate + sort that doesn't spill,
> whatever. We're mostly concerned about keeping the one available fast
> strategy than we are about getting a new, generally slow strategy.

I don't know; it depends. Like, if the less-sequential I/O pattern that
is caused by a HashAgg is not really any slower than a Sort+GroupAgg,
then whatever. The planner might as well try a HashAgg - because it
will be fast if it stays in memory - and if it doesn't work out, we've
lost little by trying. But if a Sort+GroupAgg is noticeably faster than
a HashAgg that ends up spilling, then there is a potential regression.

I thought we had evidence that this was a real problem, but if that's
not the case, then I think we're fine as far as v13 goes.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company