On Sat, Jul 25, 2020 at 05:13:00PM -0700, Peter Geoghegan wrote:
> On Sat, Jul 25, 2020 at 5:05 PM Tomas Vondra
> <tomas.von...@2ndquadrant.com> wrote:
> > I'm not sure what you mean by "reported memory usage doesn't reflect the
> > space used for transition state"? Surely it does include that; we've
> > built the memory accounting stuff pretty much exactly to do that.
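
To make "space used for transition state" a bit more concrete, here's a rough illustration (made up for this message, not the query from the benchmark): with something like array_agg() the per-group transition state is the accumulated array itself, so it's the transition states rather than the grouping keys that dominate the hash table, and that's exactly the memory the accounting is meant to capture.

    -- illustrative only: each group's transition state keeps growing
    -- as more rows are fed into array_agg()
    SELECT g % 1000 AS key, array_agg(g) AS state
      FROM generate_series(1, 1000000) AS g
     GROUP BY 1;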

> > I think it's pretty clear what's happening - in the sorted case there's
> > only a single group getting new values at any moment, so when we decide
> > to spill we'll only add rows to that group and everything else will be
> > spilled to disk.

> Right.

> > In the unsorted case, however, we manage to initialize all groups in the
> > hash table, but at that point the groups are tiny and fit into work_mem.
> > As we process more and more data the groups grow, but we can't evict
> > them - at the moment we don't have that capability. So we end up
> > processing everything in memory, while significantly exceeding work_mem.
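
To illustrate the difference between the two cases, here's a rough reproduction sketch. It's not the exact test case from the benchmark - the table names, row counts and the array_agg() aggregate are made up here, and it assumes the planner picks a hash aggregate for both queries:

    -- same data, once with keys arriving in order, once shuffled
    CREATE TABLE t_sorted AS
        SELECT i / 100 AS key, md5(i::text) AS val
          FROM generate_series(1, 5000000) AS i;

    CREATE TABLE t_random AS
        SELECT key, val FROM t_sorted ORDER BY random();

    SET work_mem = '200MB';

    -- sorted input: only the current group accumulates new values,
    -- everything else can be spilled, so memory stays close to work_mem
    EXPLAIN (ANALYZE, COSTS OFF)
    SELECT key, array_agg(val) FROM t_sorted GROUP BY key;

    -- random input: all the groups get created early and their transition
    -- states grow together, with no way to evict them once built
    EXPLAIN (ANALYZE, COSTS OFF)
    SELECT key, array_agg(val) FROM t_random GROUP BY key;

The interesting bit is comparing the "Peak Memory Usage" reported for the two plans against the 200MB limit.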

> work_mem was set to 200MB, which is more than the reported "Peak
> Memory Usage: 1605334kB". So either the random case significantly

That's 1.6GB, if I read it right. Which is more than 200MB ;-)

> exceeds work_mem and the "Peak Memory Usage" accounting is wrong
> (because it doesn't report this excess), or the random case really
> doesn't exceed work_mem but has a surprising advantage over the sorted
> case.

> --
> Peter Geoghegan
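
FWIW a quick sanity check on the units (just an illustration, not part of the benchmark output): the reported peak is roughly eight times the configured limit.

    SELECT pg_size_pretty(1605334::bigint * 1024)    AS reported_peak,   -- ~1568 MB
           pg_size_pretty(200::bigint * 1024 * 1024) AS work_mem_limit;  -- 200 MB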

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

