On 2018-06-06 17:17:52 -0700, Andres Freund wrote:
> On 2018-06-07 12:11:37 +1200, David Rowley wrote:
> > On 7 June 2018 at 08:11, Tomas Vondra <tomas.von...@2ndquadrant.com> wrote:
> > > On 06/06/2018 04:11 PM, Andres Freund wrote:
> > >> Consider e.g. a scheme where we'd switch from hashed aggregation to
> > >> sorted aggregation due to memory limits, but already have a number of
> > >> transition values in the hash table. Whenever the size of the transition
> > >> values in the hashtable exceeds memory size, we write one of them to the
> > >> tuplesort (with serialized transition value). From then on further input
> > >> rows for that group would only be written to the tuplesort, as the group
> > >> isn't present in the hashtable anymore.
> > >>
> > >
> > > Ah, so you're suggesting that during the second pass we'd deserialize
> > > the transition value and then add the tuples to it, instead of building
> > > a new transition value. Got it.
> >
> > Having to deserialize every time we add a new tuple sounds terrible
> > from a performance point of view.
>
> I didn't mean that we do that, and I don't think David understood it as
> that either. I was talking about the approach where the second pass is a
> sort rather than hash based aggregation. Then we would *not* need to
> deserialize more than exactly once.
s/David/Tomas/, obviously. Sorry, it's been a long day.

Greetings,

Andres Freund
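
A minimal, self-contained C sketch of the scheme being discussed, for a toy
sum(value) GROUP BY key aggregate. The eviction policy, the "serialization"
(just copying a long), and the qsort'ed array standing in for the tuplesort
are all illustrative stand-ins, not PostgreSQL internals. What it shows is
the point made above: an evicted group's transition value is written to the
sort once, later rows for that group bypass the hash table entirely, and the
sort-based second pass deserializes each spilled transition value exactly
once before folding the remaining plain rows into it.

/*
 * spillagg.c: toy sketch of spilling hashed aggregation to a sort-based
 * second pass; not PostgreSQL code.
 */
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

#define MAX_GROUPS 2			/* pretend work_mem: two resident groups */

typedef struct { int key; long trans; bool used; } HashEntry;
/* one spilled item: a serialized transition value, or a plain input row */
typedef struct { int key; bool is_trans; long value; } SpillItem;

static HashEntry hashtab[MAX_GROUPS];
static SpillItem spill[64];
static int	nspill;
static int	spilled_keys[64];
static int	nspilled;

static bool
key_was_spilled(int key)
{
	for (int i = 0; i < nspilled; i++)
		if (spilled_keys[i] == key)
			return true;
	return false;
}

static int
cmp_spill(const void *a, const void *b)
{
	const SpillItem *x = a, *y = b;

	if (x->key != y->key)
		return (x->key > y->key) - (x->key < y->key);
	return (int) y->is_trans - (int) x->is_trans;	/* trans value sorts first */
}

int
main(void)
{
	int		keys[] = {1, 2, 1, 3, 2, 3, 1, 4, 3};
	long	values[] = {10, 20, 1, 30, 2, 3, 100, 40, 300};

	/* phase 1: hashed aggregation, spilling once "memory" is exhausted */
	for (int i = 0; i < 9; i++)
	{
		HashEntry  *e = NULL;

		if (key_was_spilled(keys[i]))
		{
			/* group already evicted: row goes straight to the tuplesort */
			spill[nspill++] = (SpillItem) {keys[i], false, values[i]};
			continue;
		}
		for (int j = 0; j < MAX_GROUPS; j++)
			if (hashtab[j].used && hashtab[j].key == keys[i])
				e = &hashtab[j];
		if (e)
		{
			e->trans += values[i];		/* ordinary transition call */
			continue;
		}
		/* new group: take a free slot, else evict a victim */
		for (int j = 0; j < MAX_GROUPS && !e; j++)
			if (!hashtab[j].used)
				e = &hashtab[j];
		if (!e)
		{
			/* serialize the victim's transition value, exactly once */
			e = &hashtab[0];
			spill[nspill++] = (SpillItem) {e->key, true, e->trans};
			spilled_keys[nspilled++] = e->key;
		}
		*e = (HashEntry) {keys[i], values[i], true};
	}

	/* groups still resident saw every one of their rows: emit directly */
	for (int j = 0; j < MAX_GROUPS; j++)
		if (hashtab[j].used)
			printf("key %d sum %ld (hash)\n", hashtab[j].key, hashtab[j].trans);

	/* phase 2: sort-based pass over the spilled items */
	qsort(spill, nspill, sizeof(SpillItem), cmp_spill);
	for (int i = 0; i < nspill;)
	{
		int		key = spill[i].key;
		long	trans = 0;

		if (spill[i].is_trans)
			trans = spill[i++].value;	/* deserialize, once per group */
		while (i < nspill && spill[i].key == key)
			trans += spill[i++].value;	/* fold remaining plain rows in */
		printf("key %d sum %ld (sort)\n", key, trans);
	}
	return 0;
}

With the two-group limit, keys 1 and 3 get evicted mid-stream; the program
prints sums 22 and 40 from the hash table and 111 and 333 from the sort
pass, each spilled transition value having been read back a single time.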