Re: Ordered Partitioned Table Scans

Simon Riggs Fri, 22 Mar 2019 08:29:20 -0700

On Fri, 22 Mar 2019 at 11:12, Tom Lane <t...@sss.pgh.pa.us> wrote:

> David Rowley <david.row...@2ndquadrant.com> writes:
> > On Sat, 9 Mar 2019 at 10:52, Tom Lane <t...@sss.pgh.pa.us> wrote:
> >>> This can be a huge win for queries of the form "ORDER BY partkey LIMIT
> >>> x".  Even if the first subpath(s) aren't natively ordered, not all of
> >>> the sorts should actually be performed.
>
> >> [ shrug... ] We've got no realistic chance of estimating such situations
> >> properly, so I'd have no confidence in a plan choice based on such a
> >> thing.  Nor do I believe that this case is all that important.
>
> > Wondering if you can provide more details on why you think there's no
> > realistic chance of the planner costing these cases correctly?
>
> The reason why I'm skeptical about LIMIT with a plan of the form
> append-some-sorts-together is that there are going to be large
> discontinuities in the cost-vs-number-of-rows-returned graph,
> ie, every time you finish one child plan and start the next one,
> there'll be a hiccup while the sort happens.  This is something
> that our plan cost approximation (startup cost followed by linear
> output up to total cost) can't even represent.  Sticking a
> LIMIT on top of that is certainly not going to lead to any useful
> estimate of the actual cost, meaning that if you get a correct
> plan choice it'll just be by luck, and if you don't there'll be
> nothing to be done about it.
>
> If we don't incorporate that, then the situation that the planner
> will have to model is a MergeAppend with possibly some sorts in
> front of it, and it will correctly cost that as if all the sorts
> happen before any output occurs, and so you can hope that reasonable
> plan choices will ensue.
>
> I believe that the cases where a LIMIT query actually ought to go
> for a fast-start plan are where *all* the child rels have fast-start
> (non-sort) paths --- which is exactly the cases we'd get if we only
> allow "sorted" Appends when none of the inputs require a sort.
> Imagining that we'll get good results in cases where some of them
> need to be sorted is just wishful thinking.
>
> > It would be unfortunate to reject this patch based on a sentence that
> > starts with "[ shrug... ]", especially so when three people have stood
> > up and disagreed with you.
>
> I don't want to reject the patch altogether, I just want it to not
> include a special hack to allow Append rather than MergeAppend in such
> cases.  I am aware that I'm probably going to be shouted down on this
> point, but that will not change my opinion that the shouters are wrong.
>


I agree that the issue of mixing sorts at various points will make nonsense
of the startup cost/total cost results.

What you say about LIMIT is exactly right. But ISTM that LIMIT itself is
the issue there and it need more smarts to correctly calculate costs.

I don't see LIMIT costing being broken as a reason to restrict this
optimization. I would ask that we allow improvements to the important use
case of ORDER BY/LIMIT, then spend time on making LIMIT work correctly.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Ordered Partitioned Table Scans

Reply via email to