On Tue, 2020-04-07 at 11:20 -0700, Jeff Davis wrote:
> Now that we have Disk-based Hash Aggregation, there are a lot more
> situations where the planner can choose HashAgg. The
> enable_hashagg_disk GUC, if set to true, chooses HashAgg based on
> costing. If false, it only generates a HashAgg path if it thinks it
> will fit in work_mem, similar to the old behavior (though it wlil now
> spill to disk if the planner was wrong about it fitting in work_mem).
> The current default is true.
> 
> I expect this to be a win in a lot of cases, obviously. But as with
> any
> planner change, it will be wrong sometimes. We may want to be
> conservative and set the default to false depending on the experience
> during beta. I'm inclined to leave it as true for now though, because
> that will give us better information upon which to base any decision.

A compromise may be to multiply the disk costs for HashAgg by, e.g. a
1.5 - 2X penalty. That would make the plan changes less abrupt, and may
mitigate some of the concerns about I/O patterns that Tomas raised
here:


https://www.postgresql.org/message-id/20200519151202.u2p2gpiawoaznsv2@development

The issues were improved a lot, but it will take us a while to really
tune the IO behavior as well as Sort.

Regards,
        Jeff Davis




Reply via email to