Re: [HACKERS] Optimize ORDER BY ... LIMIT

2006-09-16 Thread Nicolas Barbier

2006/9/16, Gregory Stark [EMAIL PROTECTED]:


Alvaro Herrera [EMAIL PROTECTED] writes:

 I don't know if this is the same thing you are talking about, but Oleg
 talked to me at the conference about partial sort, which AFAICS is
 about the same thing you are talking about.  I think Teodor submitted a
 patch to implement it, which was rejected for not being general
 enough.

Oof, you have a long memory. Oleg does reference such a thing in his 2002 post
that ended up resulting in the TODO item. I can't find the original patch but
I doubt any patch against 7.1 is going to be all that helpful in understanding
what to do today.

I'm also confused how he only saw a factor of 6 improvement in reading the top
100 out of a million. I would expect much better.


For example, consider the case in which 6 passes are needed to do the
full sort. Then, for a partial sort, at least the first of these
passes has to be fully executed, because one needs to read all the
data at least once to find the top n.
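
A rough back-of-the-envelope model of that argument (illustrative only;
it assumes each pass reads and writes roughly the whole input and it
ignores CPU costs entirely):

    # With k merge passes, a full external sort touches the data about 2*k
    # times (one read plus one write per pass).  A top-n partial sort still
    # has to read everything once, so the best possible win is on the order
    # of the number of passes, not anything like total/n, which is why an
    # observed factor of 6 is plausible.

    def full_sort_io(total_tuples, passes):
        return 2 * passes * total_tuples       # reads + writes per pass

    def partial_sort_io(total_tuples, n):
        return total_tuples + n                # one full read; top-n stays in memory

    total, passes, n = 1_000_000, 6, 100
    print(full_sort_io(total, passes) / partial_sort_io(total, n))   # ~12x at best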

greetings,
Nicolas

--
Nicolas Barbier
http://www.gnu.org/philosophy/no-word-attachments.html



Re: [HACKERS] Optimize ORDER BY ... LIMIT

2006-09-15 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 I've been looking at doing the following TODO item:
 Allow ORDER BY ... LIMIT # to select high/low value without sort or index
 using a sequential scan for highest/lowest values

 I think this is pretty important to cover at some point because really _not_
 doing this is just wrong.

I can't get all *that* excited about it, since an index solves the
problem.

 The way I see to do this is to still use a Sort node and a tuplesort, but
 to arrange to pass the maximum number of tuples needed down to the
 tuplesort so it can throw out tuples as it sorts.

The implementation that was proposed in the earlier discussion did not
involve hacking the sort code beyond recognition ;-).

I believe a better way to think about this would be as an aggregate that
remembers the top N rows.  It can't quite be an aggregate as it stands
(unless we want to invent aggregates that can return SETOF?)  but I
think there might be some useful overlap with the SQL2003
window-function concept.
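
A minimal sketch of what such a top-N aggregate could look like, with the
transition/final split that aggregates use (illustrative Python, not
PostgreSQL code; topn_transfn and topn_finalfn are made-up names, and for
simplicity rows with equal keys must themselves be comparable):

    import heapq

    def topn_transfn(state, row, key, n):
        # state is a bounded heap of the n smallest-key rows seen so far,
        # stored as (-key, row) so the worst kept row sits at state[0]
        if state is None:
            state = []
        item = (-key(row), row)
        if len(state) < n:
            heapq.heappush(state, item)
        elif item[0] > state[0][0]:        # new key beats the worst kept key
            heapq.heapreplace(state, item)
        return state

    def topn_finalfn(state):
        # return the remembered rows in ascending key order (the SETOF result)
        return [row for _, row in sorted(state or [], reverse=True)]

    # keep the 3 smallest values of a stream
    state = None
    for x in [5, 1, 9, 3, 7, 2]:
        state = topn_transfn(state, x, key=lambda r: r, n=3)
    print(topn_finalfn(state))             # [1, 2, 3]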

regards, tom lane



Re: [HACKERS] Optimize ORDER BY ... LIMIT

2006-09-15 Thread Andrew Dunstan

Tom Lane wrote:
(unless we want to invent aggregates that can return SETOF?)  
  


Doesn't sound like a bad idea at all ...

cheers

andrew




Re: [HACKERS] Optimize ORDER BY ... LIMIT

2006-09-15 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 I believe a better way to think about this would be as an aggregate that
 remembers the top N rows.  

Wouldn't such a thing just be a reimplementation of a tuplestore though? I
mean, it's storing tuples you feed it, sorting them, and spitting them back
out in sorted order.

What would you do if the set of tuples turned out to be larger than expected
and didn't fit in memory? Create a tuplesort and pass them on to it?

I've already looked at tuplesort and the changes there are minimal. The hard
part is what to do in the planner and executor to get the information to the
tuplesort. Do we want the plan to look the way it does now, or use some new
sort of node that consolidates the limit and the sort in the same place?

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com



Re: [HACKERS] Optimize ORDER BY ... LIMIT

2006-09-15 Thread Martijn van Oosterhout
On Fri, Sep 15, 2006 at 05:30:27PM +0100, Gregory Stark wrote:
 Also, because heap sort is slower than qsort (on average anyways) it makes
 sense to not bother with the heap until the number of tuples grows well beyond
 the limit or until it would otherwise spill to disk.

The thought that comes to mind is to leave the sorting as is, but
change the code that writes to the tapes to stop writing once it hits
the limit. So each tape will never have more than N tuples, where N is
the limit. This would be fairly unobtrusive because none of the other
code actually needs to care.
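
A minimal sketch of that idea (illustrative Python rather than tuplesort.c;
heapq.merge stands in for the final tape merge):

    import heapq

    def write_run(sorted_chunk, limit):
        # a real implementation would write the run to a tape; here we just
        # truncate it, since tuples past the limit can never be in the result
        return sorted_chunk[:limit]

    def external_top_n(chunks, limit):
        runs = [write_run(sorted(chunk), limit) for chunk in chunks]
        return list(heapq.merge(*runs))[:limit]   # merge the truncated runs

    print(external_top_n([[9, 2, 7], [4, 8, 1], [6, 3, 5]], limit=2))   # [1, 2]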

 Alternatively we could have Limit(Sort()), Unique(Sort()), and
 Limit(Unique(Sort())) be handled by new types of Sort nodes entirely and not
 introduce the Limit and Unique nodes at all. 

I don't think it's easy to merge Unique and Sort, mainly because the
fields you're doing the Unique on are probably not the fields you're
sorting on, so you're probably not saving much.

However, merging Unique/Distinct/GroupBy is another avenue that has
been considered.

In general LIMIT is not handled badly, because we don't execute further
once we have the required number of tuples. Only nodes that materialize
their input are a problem, Sort being the common one.

 Or am I overthinking this and having some state nodes peek inside other state
 nodes is normal?

I don't think so. In general the parser and planner poke around quite a
bit, but once the optimizer has been over it, the plan has to be
static, for rescans, backward scans, executing stored plans, etc.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.




Re: [HACKERS] Optimize ORDER BY ... LIMIT

2006-09-15 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 Gregory Stark [EMAIL PROTECTED] writes:

 I think this is pretty important to cover at some point because really _not_
 doing this is just wrong.

 I can't get all *that* excited about it, since an index solves the
 problem.

Well I'm not all *that* excited about it either; it's just another plan, and
there are an infinite number of possible plans out there we could implement
for various corner cases.

But just in case it's not clear for anyone, the usual use case for this is
paging results on a web page. As much as I normally try to convince people
they don't want to do it that way, they usually do end up with it implemented
using limit/offset. And Postgres currently is absolutely *awful* at running
those queries.

Often the killer requirement that makes it infeasible to create an index is
precisely that they want to be able to sort on any of a long list of possible
keys. Creating dozens of indexes on every table isn't too appealing.

And in any case the query is often a join where the data in the sort key isn't
even all coming from the same table or where you need to use other indexes to
fetch the data prior to the sort.

I won't discourage anyone from working on OLAP queries, and this is indeed a
similar idea. I suspect the same functionality in tuplesort (being able to
set a maximum number of tuples to keep) will be useful there too.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com



Re: [HACKERS] Optimize ORDER BY ... LIMIT

2006-09-15 Thread Alvaro Herrera
Gregory Stark wrote:
 Tom Lane [EMAIL PROTECTED] writes:
 
  I believe a better way to think about this would be as an aggregate that
  remembers the top N rows.  
 
 Wouldn't such a thing just be a reimplementation of a tuplestore though? I
 mean, it's storing tuples you feed it, sorting them, and spitting them back
 out in sorted order.

I don't know if this is the same thing you are talking about, but Oleg
talked to me at the conference about partial sort, which AFAICS is
about the same thing you are talking about.  I think Teodor submitted a
patch to implement it, which was rejected for not being general
enough.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] Optimize ORDER BY ... LIMIT

2006-09-15 Thread Gregory Stark

Martijn van Oosterhout kleptog@svana.org writes:

 On Fri, Sep 15, 2006 at 05:30:27PM +0100, Gregory Stark wrote:
 Also, because heap sort is slower than qsort (on average anyways) it makes
 sense to not bother with the heap until the number of tuples grows well
 beyond the limit or until it would otherwise spill to disk.

 The thought that comes to mind is to leave the sorting as is, but
 change the code that writes to the tapes to stop writing once it hits
 the limit. So each tape will never have more than N tuples, where N is
 the limit. This would be fairly unobtrusive because none of the other
 code actually needs to care.

I'm sorry, I forgot to mention that part of my analysis. Once you reach disk
any chance of optimising the limit case is pretty much out the window. You
could trim some tuples from each tape but unless you're sorting truly
stupendous amounts of data I doubt it would really help much.

I think it only makes sense to look at the in-memory case. Instead of
qsorting thousands of records or, worse, spilling millions of records to disk
and doing an external sort only to keep the top 10 and throw the rest away,
we throw tuples away before they accumulate in memory in the first place.
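
A sketch of that in-memory strategy (illustrative Python; the switch-over
factor of 10 is an arbitrary placeholder, not a tuned value):

    import heapq

    def top_n(rows, n, switch_factor=10):
        buf, heap = [], None               # heap holds negated values once we switch
        for r in rows:
            if heap is not None:
                if r < -heap[0]:           # beats the current worst kept value
                    heapq.heapreplace(heap, -r)
                continue
            buf.append(r)
            if len(buf) > switch_factor * n:
                # input grew well beyond the limit: keep only the best n and
                # discard everything else as it arrives from now on
                heap = [-x for x in heapq.nsmallest(n, buf)]
                heapq.heapify(heap)
                buf = []
        if heap is None:
            return sorted(buf)[:n]         # small input: plain qsort-style path
        return sorted(-x for x in heap)

    print(top_n(range(1000, 0, -1), 5))    # [1, 2, 3, 4, 5]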

 Alternatively we could have Limit(Sort()), Unique(Sort()), and
 Limit(Unique(Sort())) be handled by new types of Sort nodes entirely and not
 introduce the Limit and Unique nodes at all. 

 I don't think it's easy to merge Unique and Sort, mainly because the
 fields you're doing the Unique on are probably not the fields you're
 sorting on, so you're probably not saving much.

On the contrary, I think the vast majority of the time you have a Unique(Sort)
it will be on the same key, because it will be caused by a SELECT DISTINCT. Now
that I actually test it I see there are more nodes that could implement
this (namely GroupAgg), so I'm thinking more and more about having an abstract
way to pass information down through the nodes rather than handling just
Limit/Sort specifically.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com



Re: [HACKERS] Optimize ORDER BY ... LIMIT

2006-09-15 Thread mark
On Fri, Sep 15, 2006 at 08:22:50PM +0100, Gregory Stark wrote:
 But just in case it's not clear for anyone the usual use case for
 this paging results on a web page. As much as I normally try to
 convince people they don't want to do it that way they usually do
 end up with it implemented using limit/offset. And Postgres
 currently is absolutely *awful* at running those queries.

I'm curious, as I may be such an offender. What alternatives exist?

I think you mean the concept of showing a page of information that
you can navigate forwards and backwards a page, or select a range.

What alternatives to limit/offset exist? If there are thousands or
more results, I have trouble with the idea that the entire result set
should be queried and cached, displaying only a fraction.

I think most or all of the times I do this, an index is available,
so perhaps that is why I don't find this issue problematic?

For implementation, I think something simple is best:

- limit X offset Y

This means keeping a set of X+Y tuples as follows:

1) If the set of X+Y tuples still has room, insert the row using a binary
   search that retains the ordering that would result if limit/offset had
   not been used.

2) Otherwise, if the row sorts before the (X+Y)'th tuple, remove the
   (X+Y)'th tuple from the set, and insert the row as per 1).

3) Otherwise, ignore the tuple.

At the end, you return tuples from the set starting at position Y+1 and
ending at Y+X.

If X+Y tuples would take up too much memory, this plan should be
skipped, and the general routines used instead.
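
A rough sketch of that scheme (illustrative Python, not executor code; key()
stands in for the ORDER BY comparison):

    import bisect

    def limit_offset(rows, x, y, key=lambda r: r):
        cap = x + y
        keys, kept = [], []                    # kept stays sorted by key
        for r in rows:
            k = key(r)
            if len(kept) < cap:                # case 1: room left, binary-search insert
                i = bisect.bisect_right(keys, k)
                keys.insert(i, k); kept.insert(i, r)
            elif k < keys[-1]:                 # case 2: beats the (X+Y)'th tuple
                del keys[-1], kept[-1]
                i = bisect.bisect_right(keys, k)
                keys.insert(i, k); kept.insert(i, r)
            # case 3: otherwise ignore the tuple
        return kept[y:y + x]                   # rows Y+1 .. Y+X

    print(limit_offset([7, 3, 9, 1, 5, 8, 2], x=2, y=1))   # LIMIT 2 OFFSET 1 -> [2, 3]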

Cheers,
mark

-- 
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] 
__
.  .  _  ._  . .   .__.  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/|_ |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
   and in the darkness bind them...

   http://mark.mielke.cc/




Re: [HACKERS] Optimize ORDER BY ... LIMIT

2006-09-15 Thread Gregory Stark

[EMAIL PROTECTED] writes:

 On Fri, Sep 15, 2006 at 08:22:50PM +0100, Gregory Stark wrote:

 I'm curious, as I may be such an offender. What alternatives exist?

 I think you mean the concept of showing a page of information that
 you can navigate forwards and backwards a page, or select a range.

 What alternatives to limit/offset exist? If there are thousands or
 more results, I have trouble with the idea that the entire result set
 should be queried and cached, displaying only a fraction.

 I think most or all of the times I do this, an index is available,
 so perhaps that is why I don't find this issue problematic?

If you have a unique index and instead of using OFFSET you pass along the last
key of the previous page then you can use a WHERE clause on the indexed column
to go straight to the correct page rather than using OFFSET.

So for example if you're displaying bank transactions sorted by transaction_id,
you have the next page button pass along start_transaction_id=nnn,
where nnn is the last transaction_id of the previous page. Then on the next
page you do a query with WHERE transaction_id > ? and pass that value. You
still use ORDER BY transaction_id and LIMIT.
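
For concreteness, the two query shapes side by side (table and column names
follow the example above; the psycopg-style parameter binding is an assumed
illustration, not part of the original suggestion):

    PAGE_SIZE = 50

    # OFFSET paging: later pages cost more, because all the skipped rows must
    # be produced and thrown away first.
    offset_page = """
        SELECT * FROM transactions
        ORDER BY transaction_id
        LIMIT %(limit)s OFFSET %(offset)s
    """

    # Keyset paging: the next-page link carries the last transaction_id shown,
    # so the index goes straight to the right spot and every page costs about
    # the same.
    keyset_page = """
        SELECT * FROM transactions
        WHERE transaction_id > %(last_seen)s
        ORDER BY transaction_id
        LIMIT %(limit)s
    """

    # e.g. cur.execute(keyset_page, {"last_seen": last_id_on_previous_page,
    #                                "limit": PAGE_SIZE})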

This has the upside that your query always takes the same amount of time.

Using OFFSET means later pages take longer, possibly much longer, than earlier
pages. Possibly so much longer that it causes enough i/o to bring down your
web server etc.

It does have lots of downsides as well. You cannot provide direct links to the
later pages aside from the next page. It's difficult to provide a proper
previous page button, etc. Early in the web's history these were reasonable
trade-offs but nowadays it's hard to convince people that their $bazillion
machine can't handle sorting a couple thousand records for each page view and
they should sacrifice the user experience for that. So I've pretty much given
up on that battle.

 For implementation, I think something simple is best:
[...]

You just described using an insertion sort. Even if I went with insertion sort
instead of heap sort I think it makes sense to do it inside tuplesort. 

 If X+Y tuples would take up too much memory, this plan should be
 skipped, and the general routines used instead.

And that's a big part of why...


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com



Re: [HACKERS] Optimize ORDER BY ... LIMIT

2006-09-15 Thread Gregory Stark
Alvaro Herrera [EMAIL PROTECTED] writes:

 I don't know if this is the same thing you are talking about, but Oleg
 talked to me at the conference about partial sort, which AFAICS is
 about the same thing you are talking about.  I think Teodor submitted a
 patch to implement it, which was rejected for not being general
 enough.


Oof, you have a long memory. Oleg does reference such a thing in his 2002 post
that ended up resulting in the TODO item. I can't find the original patch but
I doubt any patch against 7.1 is going to be all that helpful in understanding
what to do today.

I'm also confused how he only saw a factor of 6 improvement in reading the top
100 out of a million. I would expect much better.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com



Re: [SPAM?] Re: [HACKERS] Optimize ORDER BY ... LIMIT

2006-09-15 Thread mark
On Fri, Sep 15, 2006 at 10:06:16PM +0100, Gregory Stark wrote:
  I'm curious, as I may be such an offender. What alternatives exist?
  ...
  What alternatives to limit/offset exist? If there are thousands or
  more results, I have trouble with an idea that the entire results
  should be queried, and cached, displaying only a fraction.

 If you have a unique index and instead of using OFFSET you pass
 along the last key of the previous page then you can use a WHERE
 clause on the indexed column to go straight to the correct page
 rather than using OFFSET.  So for example if you're displaying bank
 transactions sorted by transaction_id you have the next page
 button pass along the start_transaction_id=nnn where nnn is the
 last transaction_id of the previous page. Then on the next page you
 do a query with WHERE transaction_id > ? and pass that value. You
 still use ORDER BY transaction_id and LIMIT.

I found benefits to doing things this way that were not related to
performance. If the number of items leading up to your page changes,
remembering the offset can result in listing a completely different
page than you intended when paging forward or backwards. On my pages,
I prefer to define one of the items as the item I am looking at, and
page seeking is always +/- 1 page from that item. This means that I
am pretty close to what you are suggesting, except that, because I do
this for functional reasons and not for performance reasons, I am
doing something worse.

I use COUNT(*) and WHERE as you describe above to map this identifier
to an offset, and then a second SELECT with LIMIT/OFFSET to describe
the object and those that follow on the page.

According to your suggestion, I think this means I should track the
identifier with the last offset, displaying the offset to the user for
information purposes only, not using it for any queries, and then use
WHERE and LIMIT?

I tried this out. EXPLAIN ANALYZE tells me that for a random
offset=200, limit=20 case I tried, the simple change takes it from
index scanning 207 rows to find 7 rows, to index scanning 7 rows to
find 7 rows. Sweet. Unfortunately, the time to complete is unchanged
at around 1.3+/-0.2 milliseconds. Looks like my system has bigger
bottlenecks. :-)

Thanks,
mark

-- 
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] 
__
.  .  _  ._  . .   .__.  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/|_ |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
   and in the darkness bind them...

   http://mark.mielke.cc/

