On Sat, 2019-12-14 at 18:32 +0100, Tomas Vondra wrote: > So I think we're not costing the batching properly / at all.
Thank you for all of the testing! I think the results are good: even for cases where HashAgg is the wrong choice, it's not too bad. You're right that costing is not done, and when it is, I think it will avoid these bad choices most of the time. > A couple more comments: > > 1) IMHO we should rename hashagg_mem_overflow to > enable_hashagg_overflow > or something like that. I think that describes the GUC purpose better > (and it's more consistent with enable_hashagg_spill). The other enable_* GUCs are all planner GUCs, so I named this one differently to stand out as an executor GUC. > 2) show_hashagg_info > > I think there's a missing space after ":" here: > > " Batches: %d Disk Usage:%ldkB", > > and maybe we should use just "Disk:" just like in we do for sort: Done, thank you. > 3) I'm not quite sure what to think about the JIT recompile we do for > EEOP_AGG_INIT_TRANS_SPILLED etc. I'm no llvm/jit expert, but do we do > that for some other existing cases? Andres asked for that explicitly to avoid branches in the non-spilling code path (or at least branches that are likely to be mispredicted). Regards, Jeff Davis