Re: [HACKERS] Optimize ORDER BY ... LIMIT

Martijn van Oosterhout Fri, 15 Sep 2006 12:23:12 -0700

On Fri, Sep 15, 2006 at 05:30:27PM +0100, Gregory Stark wrote:
> Also, because heap sort is slower than qsort (on average anyways) it makes
> sense to not bother with the heap until the number of tuples grows well beyond
> the limit or until it would otherwise spill to disk.


The thought that comes to mind is to leave the sorting as is, but
change the code that writes to the tapes to stop writing once it hits
the limit. So each tape will never have more than N tuples, where N is
the limit. This would be fairly unobtrusive because none of the other
code actually needs to care.

> Alternatively we could have Limit(Sort()), Unique(Sort()), and
> Limit(Unique(Sort())) be handled by new types of Sort nodes entirely and not
> introduce the Limit and Unique nodes at all. 

I don't think it's easy to merge Unique and Sort, mainly because the
fields you're doing the Unique on are probably not the fields you're
sorting on, so you're probably not saving much.

However, merging Unique/Distinct/GroupBy is another avenue that has
been considered.

In general LIMIT is not handled bad because we don't execute further
once we have the number of tuples. Only nodes that Materialize are a
problem, basically Sort being the common one.

> Or am I overthinking this and having some state nodes peek inside other state
> nodes is normal?

I don't think so. In general the parser and planner poke around quite a
bit, but once the optimizer has been over it, the plan has to be
static, for rescans, backward scans, executing stored plans, etc.

Have a nice day,
-- 
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to 
> litigate.

signature.asc
Description: Digital signature

Re: [HACKERS] Optimize ORDER BY ... LIMIT

Reply via email to