Re: [HACKERS] Optimize ORDER BY ... LIMIT
2006/9/16, Gregory Stark [EMAIL PROTECTED]:

> Alvaro Herrera [EMAIL PROTECTED] writes:
>
> > I don't know if this is the same thing you are talking about, but Oleg
> > talked to me at the conference about "partial sort", which AFAICS is
> > about the same thing you are talking about. I think Teodor submitted a
> > patch to implement it, which was rejected for not being general enough.
>
> Oof, you have a long memory. Oleg does reference such a thing in his 2002
> post that ended up resulting in the TODO item. I can't find the original
> patch, but I doubt any patch against 7.1 is going to be all that helpful
> in understanding what to do today.
>
> I'm also confused how he only saw a factor of 6 improvement in reading the
> top 100 out of a million. I would expect much better.

For example, consider the case in which 6 passes are needed to do the full
sort. Then, for a partial sort, at least the first of these passes has to be
fully executed, because one needs to read all the data at least once to find
the top n.

greetings,
Nicolas

--
Nicolas Barbier
http://www.gnu.org/philosophy/no-word-attachments.html
Re: [HACKERS] Optimize ORDER BY ... LIMIT
Gregory Stark [EMAIL PROTECTED] writes:

> I've been looking at doing the following TODO item:
>
>   Allow ORDER BY ... LIMIT # to select high/low value without sort or
>   index using a sequential scan for highest/lowest values
>
> I think this is pretty important to cover at some point, because really
> _not_ doing this is just wrong.

I can't get all *that* excited about it, since an index solves the problem.

> The way I see to do this is to still use a Sort node and use a tuplesort,
> but to arrange to get the information of the maximum number of tuples
> needed to the tuplesort so it can throw out tuples as it sorts.

The implementation that was proposed in the earlier discussion did not
involve hacking the sort code beyond recognition ;-). I believe a better way
to think about this would be as an aggregate that remembers the top N rows.
It can't quite be an aggregate as it stands (unless we want to invent
aggregates that can return SETOF?) but I think there might be some useful
overlap with the SQL2003 window-function concept.

regards, tom lane
Re: [HACKERS] Optimize ORDER BY ... LIMIT
Tom Lane wrote:

> (unless we want to invent aggregates that can return SETOF?)

Doesn't sound like a bad idea at all ...

cheers

andrew
Re: [HACKERS] Optimize ORDER BY ... LIMIT
Tom Lane [EMAIL PROTECTED] writes:

> I believe a better way to think about this would be as an aggregate that
> remembers the top N rows.

Wouldn't such a thing just be a reimplementation of a tuplesort though? I
mean, it's storing tuples you feed it, sorting them, and spitting them back
out in sorted order. What would you do if the set of tuples turned out to be
larger than you expected and didn't fit in memory? Create a tuplesort and
pass them on to it?

I've already looked at tuplesort and the changes there are minimal. The hard
part is what to do in the planner and executor to get the information to the
tuplesort. Do we want the plan to look the way it does now, or use some new
sort of node that consolidates the limit and the sort in the same place?

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Optimize ORDER BY ... LIMIT
On Fri, Sep 15, 2006 at 05:30:27PM +0100, Gregory Stark wrote:

> Also, because heap sort is slower than qsort (on average anyway) it makes
> sense to not bother with the heap until the number of tuples grows well
> beyond the limit or until it would otherwise spill to disk.

The thought that comes to mind is to leave the sorting as is, but change the
code that writes to the tapes to stop writing once it hits the limit. So
each tape will never have more than N tuples, where N is the limit. This
would be fairly unobtrusive because none of the other code actually needs to
care.

> Alternatively we could have Limit(Sort()), Unique(Sort()), and
> Limit(Unique(Sort())) be handled by new types of Sort nodes entirely and
> not introduce the Limit and Unique nodes at all.

I don't think it's easy to merge Unique and Sort, mainly because the fields
you're doing the Unique on are probably not the fields you're sorting on, so
you're probably not saving much. However, merging Unique/Distinct/GroupBy is
another avenue that has been considered.

In general LIMIT is not handled badly, because we don't execute further once
we have the number of tuples. Only nodes that materialize are a problem,
Sort being the common one.

> Or am I overthinking this and having some state nodes peek inside other
> state nodes is normal?

I don't think so. In general the parser and planner poke around quite a bit,
but once the optimizer has been over it, the plan has to be static, for
rescans, backward scans, executing stored plans, etc.

Have a nice day,

--
Martijn van Oosterhout kleptog@svana.org
http://svana.org/kleptog/
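[Editor's note: Martijn's tape-truncation idea can be illustrated with a toy
Python model (not the tuplesort.c implementation): each sorted run ("tape")
is cut off at N tuples before the merge, and the merge itself stops after N
tuples overall, yet the correct top N still comes out.]

```python
from heapq import merge
from itertools import islice

def top_n_external(chunks, n):
    """Toy model of limit-aware tape writing: sort each in-memory chunk as
    usual, but keep at most n tuples per run; merging the truncated runs
    and stopping after n tuples still yields the overall top n."""
    runs = [sorted(chunk)[:n] for chunk in chunks]  # each "tape" holds <= n tuples
    return list(islice(merge(*runs), n))
```

This works because any tuple past position n on its own tape is beaten by n
tuples on that tape alone, so it can never appear in the global top n.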
Re: [HACKERS] Optimize ORDER BY ... LIMIT
Tom Lane [EMAIL PROTECTED] writes:

> Gregory Stark [EMAIL PROTECTED] writes:
>
> > I think this is pretty important to cover at some point, because really
> > _not_ doing this is just wrong.
>
> I can't get all *that* excited about it, since an index solves the
> problem.

Well, I'm not all *that* excited about it either; it's just another plan,
and there are an infinite number of possible plans out there we could
implement for various corner cases.

But just in case it's not clear for anyone, the usual use case for this is
paging results on a web page. As much as I normally try to convince people
they don't want to do it that way, they usually do end up with it
implemented using limit/offset. And Postgres currently is absolutely *awful*
at running those queries.

Often the killer requirement that makes it infeasible to create an index is
precisely that they want to be able to sort on any of a long list of
possible keys. Creating dozens of indexes on every table isn't too
appealing. And in any case the query is often a join where the data in the
sort key isn't even all coming from the same table, or where you need to use
other indexes to fetch the data prior to the sort.

I won't discourage anyone from working on OLAP queries, and this is indeed a
similar idea. I suspect the same functionality in tuplesort of being able to
set a maximum number of tuples to keep will be useful there too.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Optimize ORDER BY ... LIMIT
Gregory Stark wrote:

> Tom Lane [EMAIL PROTECTED] writes:
>
> > I believe a better way to think about this would be as an aggregate
> > that remembers the top N rows.
>
> Wouldn't such a thing just be a reimplementation of a tuplesort though? I
> mean, it's storing tuples you feed it, sorting them, and spitting them
> back out in sorted order.

I don't know if this is the same thing you are talking about, but Oleg
talked to me at the conference about "partial sort", which AFAICS is about
the same thing you are talking about. I think Teodor submitted a patch to
implement it, which was rejected for not being general enough.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] Optimize ORDER BY ... LIMIT
Martijn van Oosterhout kleptog@svana.org writes:

> On Fri, Sep 15, 2006 at 05:30:27PM +0100, Gregory Stark wrote:
>
> > Also, because heap sort is slower than qsort (on average anyway) it
> > makes sense to not bother with the heap until the number of tuples
> > grows well beyond the limit or until it would otherwise spill to disk.
>
> The thought that comes to mind is to leave the sorting as is, but change
> the code that writes to the tapes to stop writing once it hits the limit.
> So each tape will never have more than N tuples, where N is the limit.
> This would be fairly unobtrusive because none of the other code actually
> needs to care.

I'm sorry, I forgot to mention that part of my analysis. Once you reach
disk, any chance of optimising the limit case is pretty much out the window.
You could trim some tuples from each tape, but unless you're sorting truly
stupendous amounts of data I doubt it would really help much. I think it
only makes sense to look at the in-memory case. Instead of qsorting
thousands of records or, worse, spilling millions of records to disk and
doing an external sort only to use the top 10 and throw the rest away, we
throw tuples away before they accumulate in memory in the first place.

> > Alternatively we could have Limit(Sort()), Unique(Sort()), and
> > Limit(Unique(Sort())) be handled by new types of Sort nodes entirely
> > and not introduce the Limit and Unique nodes at all.
>
> I don't think it's easy to merge Unique and Sort, mainly because the
> fields you're doing the Unique on are probably not the fields you're
> sorting on, so you're probably not saving much.

On the contrary, I think the vast majority of the time you have a
Unique(Sort) it will be on the same key, because it will be caused by a
SELECT DISTINCT. Now that I actually test it I see there are more nodes that
could implement this (namely GroupAgg), so I'm thinking more and more about
having an abstract way to pass information down through the nodes rather
than handling just Limit/Sort specifically.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
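[Editor's note: the in-memory scheme Gregory describes — discard tuples
before they accumulate, once the input exceeds the limit — can be sketched
in Python (a toy illustration, not tuplesort.c code). A bounded max-heap of
size n holds the best n rows seen so far; its root is the current worst
keeper, so any row that sorts after it is dropped immediately.]

```python
import heapq

def lowest_n(rows, n):
    """Return the n smallest rows (ORDER BY ... LIMIT n, ascending)
    without materializing or fully sorting the input."""
    heap = []                      # max-heap of size <= n, via negated keys
    for r in rows:
        if len(heap) < n:
            heapq.heappush(heap, -r)
        elif -r > heap[0]:         # r sorts before the worst row kept so far
            heapq.heapreplace(heap, -r)
        # else: discard r immediately; it can never be in the top n
    return sorted(-x for x in heap)
```

Each incoming tuple costs O(log n) at worst, and memory stays bounded at n
tuples no matter how large the input is.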
Re: [HACKERS] Optimize ORDER BY ... LIMIT
On Fri, Sep 15, 2006 at 08:22:50PM +0100, Gregory Stark wrote:

> But just in case it's not clear for anyone, the usual use case for this
> is paging results on a web page. As much as I normally try to convince
> people they don't want to do it that way, they usually do end up with it
> implemented using limit/offset. And Postgres currently is absolutely
> *awful* at running those queries.

I'm curious, as I may be such an offender. What alternatives exist? I think
you mean the concept of showing a page of information that you can navigate
forwards and backwards a page, or select a range. What alternatives to
limit/offset exist? If there are thousands or more results, I have trouble
with the idea that the entire result set should be queried and cached,
displaying only a fraction. I think most or all of the times I do this, an
index is available, so perhaps that is why I don't find this issue
problematic?

For implementation, I think something simple is best: for "limit X offset
Y", keep a set of X+Y tuples as follows:

1) If the set of X+Y tuples still has room, insert the row using a binary
   search that retains the ordering characteristics that would be had if
   limit/offset had not been used.

2) Otherwise, if the row is less than the X+Y'th tuple, remove the X+Y'th
   tuple from the set, and insert the row as per 1).

3) Otherwise, ignore the tuple.

At the end, you return from the set starting at Y+1 and ending at Y+X. If
X+Y tuples would take up too much memory, this plan should be skipped, and
the general routines used instead.

Cheers,
mark

--
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED]
Ottawa, Ontario, Canada
http://mark.mielke.cc/
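[Editor's note: mark's three steps amount to maintaining a sorted, bounded
buffer by binary insertion. A hypothetical Python sketch, where
`bisect.insort` plays the role of the binary search and the function name is
illustrative:]

```python
import bisect

def limit_offset(rows, limit, offset):
    """Keep at most limit+offset tuples in sorted order, then return
    positions offset+1 .. offset+limit, following mark's steps 1-3."""
    keep = limit + offset
    best = []
    for r in rows:
        if len(best) < keep:
            bisect.insort(best, r)     # step 1: room left, binary-insert
        elif r < best[-1]:
            best.pop()                 # step 2: evict the (X+Y)'th tuple
            bisect.insort(best, r)     # ... and insert as per step 1
        # step 3: otherwise ignore the tuple
    return best[offset:offset + limit]
```

Note the cost difference from the heap variant discussed earlier in the
thread: insertion into a sorted array is O(keep) per accepted tuple because
of element shifting, versus O(log keep) for a heap, which is part of why
Gregory leans toward heaps below.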
Re: [HACKERS] Optimize ORDER BY ... LIMIT
[EMAIL PROTECTED] writes:

> I'm curious, as I may be such an offender. What alternatives exist? I
> think you mean the concept of showing a page of information that you can
> navigate forwards and backwards a page, or select a range. What
> alternatives to limit/offset exist? If there are thousands or more
> results, I have trouble with the idea that the entire result set should
> be queried and cached, displaying only a fraction. I think most or all of
> the times I do this, an index is available, so perhaps that is why I
> don't find this issue problematic?

If you have a unique index, then instead of using OFFSET you pass along the
last key of the previous page and use a WHERE clause on the indexed column
to go straight to the correct page. So for example, if you're displaying
bank transactions sorted by transaction_id, you have the next-page button
pass along start_transaction_id=nnn, where nnn is the last transaction_id of
the previous page. Then on the next page you do a query with WHERE
transaction_id > ? and pass that value. You still use ORDER BY
transaction_id and LIMIT.

This has the upside that your query always takes the same amount of time.
Using OFFSET means later pages take longer, possibly much longer, than
earlier pages. Possibly so much longer that it causes enough i/o to bring
down your web server, etc.

It does have lots of downsides as well. You cannot provide direct links to
the later pages, aside from the next page. It's difficult to provide a
proper previous-page button. Etc. Early in the web's history these were
reasonable trade-offs, but nowadays it's hard to convince people that their
$bazillion machine can't handle sorting a couple thousand records for each
page view and that they should sacrifice the user experience for that. So
I've pretty much given up on that battle.

> For implementation, I think something simple is best: [...]

You just described using an insertion sort. Even if I went with insertion
sort instead of heap sort, I think it makes sense to do it inside tuplesort.

> If X+Y tuples would take up too much memory, this plan should be skipped,
> and the general routines used instead.

And that's a big part of why...

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
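[Editor's note: the keyset ("seek") pagination Gregory describes can be
demonstrated end to end with a throwaway in-memory SQLite table. The schema
and names are hypothetical; the point is the shape of the SQL — the
next-page query filters on the last seen key and does an index range scan
instead of counting past OFFSET rows:]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions "
             "(transaction_id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany("INSERT INTO transactions VALUES (?, ?)",
                 [(i, i * 1.5) for i in range(1, 1001)])

PAGE_SIZE = 20

def next_page(last_seen_id):
    """Seek past the last key of the previous page; no OFFSET involved."""
    return conn.execute(
        "SELECT transaction_id, amount FROM transactions"
        " WHERE transaction_id > ?"
        " ORDER BY transaction_id LIMIT ?",
        (last_seen_id, PAGE_SIZE)).fetchall()

page1 = next_page(0)               # first page: ids 1..20
page2 = next_page(page1[-1][0])    # next page: ids 21..40, via the index
```

Every page costs one index descent plus PAGE_SIZE rows, whereas `OFFSET n`
must produce and discard n rows first, which is the degradation on later
pages that Gregory describes.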
Re: [HACKERS] Optimize ORDER BY ... LIMIT
Alvaro Herrera [EMAIL PROTECTED] writes:

> I don't know if this is the same thing you are talking about, but Oleg
> talked to me at the conference about "partial sort", which AFAICS is
> about the same thing you are talking about. I think Teodor submitted a
> patch to implement it, which was rejected for not being general enough.

Oof, you have a long memory. Oleg does reference such a thing in his 2002
post that ended up resulting in the TODO item. I can't find the original
patch, but I doubt any patch against 7.1 is going to be all that helpful in
understanding what to do today.

I'm also confused how he only saw a factor of 6 improvement in reading the
top 100 out of a million. I would expect much better.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Re: [SPAM?] Re: [HACKERS] Optimize ORDER BY ... LIMIT
On Fri, Sep 15, 2006 at 10:06:16PM +0100, Gregory Stark wrote:

> > I'm curious, as I may be such an offender. What alternatives exist?
> > ... What alternatives to limit/offset exist? If there are thousands or
> > more results, I have trouble with the idea that the entire result set
> > should be queried and cached, displaying only a fraction.
>
> If you have a unique index, then instead of using OFFSET you pass along
> the last key of the previous page and use a WHERE clause on the indexed
> column to go straight to the correct page. So for example, if you're
> displaying bank transactions sorted by transaction_id, you have the
> next-page button pass along start_transaction_id=nnn, where nnn is the
> last transaction_id of the previous page. Then on the next page you do a
> query with WHERE transaction_id > ? and pass that value. You still use
> ORDER BY transaction_id and LIMIT.

I found benefits to doing things this way that were not related to
performance. If the number of items leading up to your page changes,
remembering the offset can result in listing a completely different page
than you intended when paging forward or backwards. On my pages, I prefer to
define one of the items as the item I am looking at, and page seeking is
always +/- 1 page from that item.

This means that I am pretty close to what you are suggesting - except -
because I do this for functional reasons, and not for performance reasons, I
am doing something worse. I use COUNT(*) and WHERE as you describe above to
map this identifier to an offset, and then a second SELECT with LIMIT/OFFSET
to describe the object and those that follow on the page.

According to your suggestion, I think this means I should track the
identifier with the last offset, displaying the offset to the user for
information purposes only, not using it for any queries, and then use WHERE
and LIMIT? I tried this out.

EXPLAIN ANALYZE tells me that for a random offset=200, limit=20 case I
tried, the simple change takes it from index scanning 207 rows to find 7
rows, to index scanning 7 rows to find 7 rows. Sweet. Unfortunately, the
time to complete is unchanged, around 1.3 +/- 0.2 milliseconds. Looks like
my system has bigger bottlenecks. :-)

Thanks,
mark

--
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED]
Ottawa, Ontario, Canada
http://mark.mielke.cc/