> I understood you reasoning of why startup_cost = input_startup_cost and not
> input_total_cost for aggregation by sorting. But what I didn't understand is
> how come higher startup cost for aggregation by sorting would force hash
> aggregation to be chosen? I am not clear about this part.
See this comment in cost_agg()
* Note: in this cost model, AGG_SORTED and AGG_HASHED have exactly the
* same total CPU cost, but AGG_SORTED has lower startup cost. If the
* input path is already sorted appropriately, AGG_SORTED should be
* preferred (since it has no risk of memory overflow).
AFAIU, if the input is already sorted, aggregation by sorting and
aggregation by hashing will have almost same cost, the startup cost of
AGG_SORTED being lower than AGG_HASHED. Because of lower startup cost,
AGG_SORTED gets chosen by add_path() over AGG_HASHED path.
The Postgres Database Company
Sent via pgsql-hackers mailing list (firstname.lastname@example.org)
To make changes to your subscription: