Re: Memory-Bounded Hash Aggregation

Jeff Davis Sun, 22 Dec 2019 15:34:53 -0800

On Sat, 2019-12-14 at 18:32 +0100, Tomas Vondra wrote:
> So I think we're not costing the batching properly / at all.


Thank you for all of the testing! I think the results are good: even
for cases where HashAgg is the wrong choice, it's not too bad. You're
right that costing is not done, and when it is, I think it will avoid
these bad choices most of the time.

> A couple more comments:
> 
> 1) IMHO we should rename hashagg_mem_overflow to
> enable_hashagg_overflow
> or something like that. I think that describes the GUC purpose better
> (and it's more consistent with enable_hashagg_spill).

The other enable_* GUCs are all planner GUCs, so I named this one
differently to stand out as an executor GUC.

> 2) show_hashagg_info
> 
> I think there's a missing space after ":" here:
> 
>                               "  Batches: %d  Disk Usage:%ldkB",
> 
> and maybe we should use just "Disk:" just like in we do for sort:

Done, thank you.

> 3) I'm not quite sure what to think about the JIT recompile we do for
> EEOP_AGG_INIT_TRANS_SPILLED etc. I'm no llvm/jit expert, but do we do
> that for some other existing cases?

Andres asked for that explicitly to avoid branches in the non-spilling
code path (or at least branches that are likely to be mispredicted).

Regards,
        Jeff Davis

Re: Memory-Bounded Hash Aggregation

Reply via email to