On 04/07/2018 06:23 PM, Tom Lane wrote:
> Teodor Sigaev <teo...@sigaev.ru> writes:
>>> I dunno, how would you estimate whether this is actually a win or not?
>>> I don't think our model of sort costs is anywhere near refined enough
>>> or accurate enough to reliably predict whether this is better than
>>> just doing it in one step. Even if the cost model is good, it's not
>>> going to be better than our statistics about the number/size of the
>>> groups in the first column(s), and that's a notoriously unreliable stat.
>> I think that improvement in cost calculation of sort should be a
>> separate patch, not directly connected to this one. Postpone patches
>> till other part will be ready to get max improvement for postponed ones
>> doesn't seem to me very good, especially if it suggests some improvement
>> right now.
> No, you misunderstand the point of my argument. Without a reasonably
> reliable cost model, this patch could easily make performance *worse*
> not better for many people, due to choosing incremental-sort plans
> where they were really a loss.
Yeah. Essentially the patch could push the planner to pick a path that
has low startup cost (and very high total cost), assuming it'll only
need to read small part of the input. But if the number of groups in the
input is low (perhaps just one huge group), that would be a regression.
> If we were at the start of a development cycle and work were being
> promised to be done later in the cycle to improve the planning aspect,
> I'd be more charitable about it. But this isn't merely the end of a
> cycle, it's the *last day*. Now is not the time to commit stuff that
> needs, or even just might need, follow-on work.
+1 to that
FWIW I'm willing to spend some time on the patch for PG12, particularly
on the planner / costing part. The potential gains are too interesting.
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services