OK, thanks a lot for your explanations. Knowing how the planner "thinks", makes it pretty logical. Thank you.
Now another question...
I have a table of records representing forum posts with a primary key (id), a topic_id, a timestamp, and other fields which I won't detail. I want to split them into pages (like forums usually do), with N posts per page. In that case :
SELECT * FROM table WHERE topic_id=... ORDER BY post_timestamp asc LIMIT N OFFSET N*page;
Also it's almost the same to order by id rather than post_timestamp (id being a serial).
SELECT * FROM table WHERE topic_id=... ORDER BY id asc LIMIT N OFFSET N*page;
This query runs slower and slower as the OFFSET grows, which is a problem because the most accessed page in a forum is the last one.
So, for the last page, I tried : SELECT * FROM table WHERE topic_id=... ORDER BY id desc LIMIT N; But this does not use the index at all (seq scan + sort + limit).
My solution is simple : build an index on (-id), or on (some date)-post_timestamp, then :
SELECT * FROM table WHERE topic_id=... ORDER BY (-id) desc LIMIT N;
Then the last page is the fastest one, but it always shows N posts. That's not a problem, so I guess I'll use that. I don't like forums which show 1 post on the last page because the number of posts modulo N is 1.
I may store the number of posts in a forum (updated by a trigger) to avoid costly COUNT queries to count the pages, so I could use ORDER BY id for the first half of the pages, and ORDER BY (-id) for the rest, so it will always be fastest.
I could even create a pages table to store the id of the first post on that page and then :
SELECT * FROM table WHERE topic_id=... AND id>id_of_first_post_in_page ORDER BY id asc LIMIT N;
then all pages would be aqually fast.
Or, I could cache the query results for all pages but the last one.
Finally, the question : having a multiple field btree, it is not harder to scan it in "desc order" than in "asc order". So why does not Postgres do it ? Here is a btree example :
topic_id id 1 1 1 10 2 2 2 5 2 17 3 4 3 6
suppose I SELECT WHERE topic_id=2 ORDER BY topic_id ASC,id ASC.
Postgres simply finds the first row with topic_id=2 and goes from there.
suppose I SELECT WHERE topic_id=2 ORDER BY topic_id ASC,id DESC.
Postgres does a seq scan, but it could think a bit more and start at "first index node which has topic_id>2" (simple to find in a btree) then go backwards in the index. This can ge beneralized to any combination of (asc,desc).
I made some more experiments, and saw Postgres does an 'Index Scan' when ORDER BY clauses are all ASC, and an 'Index Scan Backwards' when all ORDER BY are DESC. However, it does not handle a combination of ASC and DESC?
What do you think of this ?
On Mon, 06 Sep 2004 12:40:41 -0400, Tom Lane <[EMAIL PROTECTED]> wrote:
=?iso-8859-15?Q?Pierre-Fr=E9d=E9ric_Caillaud?= <[EMAIL PROTECTED]> writes:Now, if I LIMIT the query to 10 rows, the index should be used all the time, because it will always return few rows... well, it doesn't !
Not at all. From the planner's point of view, the LIMIT is going to reduce the cost by about a factor of 10/1403, since the underlying plan step will only be run partway through. That's not going to change the decision about which underlying plan step is cheapest: 10/1403 of a cheaper plan is still always less than 10/1403 of a more expensive plan.
Later, you note that LIMIT with ORDER BY does affect the plan choice --- that's because in that situation one plan alternative has a much higher startup cost than the other (namely the cost of a sort step). A small LIMIT can allow the fast-startup plan to be chosen even though it would be estimated to be the loser if run to completion.
regards, tom lane
---------------------------(end of broadcast)--------------------------- TIP 6: Have you searched our list archives?
---------------------------(end of broadcast)--------------------------- TIP 4: Don't 'kill -9' the postmaster