Thanks for the rapid response. I can understand how the planner makes its guess, but in practice it will nearly always be wrong, simply because even the most-commented articles have only around 5000 or so comments. I ran the EXPLAIN ANALYZE tonight and got these results:
EXPLAIN ANALYZE SELECT _comment.id,
    (get_comment_response(_comment.id)).id AS r_id
    FROM _comment
    INNER JOIN _article
    ON _article.id = _comment.parent_id
    WHERE _comment.parent_id = '17355952'
    ORDER BY _comment.id ASC
    OFFSET 0
    LIMIT 1;

                                                                        QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..3588.42 rows=1 width=8) (actual time=498597.115..498597.116 rows=1 loops=1)
   ->  Nested Loop  (cost=0.00..28524312.40 rows=7949 width=8) (actual time=498597.114..498597.114 rows=1 loops=1)
         ->  Index Scan using _comment_pkey on _comment  (cost=0.00..28448324.73 rows=7949 width=16) (actual time=498473.360..498473.360 rows=1 loops=1)
               Filter: (parent_id = 17355952::bigint)
         ->  Index Scan using _article_pkey on _article  (cost=0.00..9.55 rows=1 width=8) (actual time=63.465..63.465 rows=1 loops=1)
               Index Cond: (_article.id = 17355952::bigint)
 Total runtime: 498615.230 ms
(7 rows)

EXPLAIN ANALYZE SELECT _comment.id,
    (get_comment_response(_comment.id)).id AS r_id
    FROM _comment
    INNER JOIN _article
    ON _article.id = _comment.parent_id
    WHERE _comment.parent_id = '17355952'
    ORDER BY _comment.id ASC
    OFFSET 0
    LIMIT 1000;

                                                                        QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=10261.19..10263.69 rows=1000 width=8) (actual time=127.037..127.267 rows=1000 loops=1)
   ->  Sort  (cost=10261.19..10281.06 rows=7949 width=8) (actual time=127.036..127.128 rows=1000 loops=1)
         Sort Key: _comment.id
         Sort Method:  top-N heapsort  Memory: 95kB
         ->  Nested Loop  (cost=0.00..9825.35 rows=7949 width=8) (actual time=0.472..122.986 rows=4674 loops=1)
               ->  Index Scan using _article_pkey on _article  (cost=0.00..9.55 rows=1 width=8) (actual time=0.011..0.013 rows=1 loops=1)
                     Index Cond: (id = 17355952::bigint)
               ->  Index Scan using idx_comment_parent_id on _comment  (cost=0.00..9716.44 rows=7949 width=16) (actual time=0.235..32.869 rows=4674 loops=1)
                     Index Cond: (_comment.parent_id = 17355952::bigint)
 Total runtime: 127.410 ms
(10 rows)

As you can see, the time is dramatically longer with LIMIT 1 (or, in our case, LIMIT 2).

Yannick.

2009/1/20 David Wilson <david.t.wil...@gmail.com>

> On Tue, Jan 20, 2009 at 10:45 AM, Yannick Le Guédart
> <yann...@over-blog.com> wrote:
> >
> > The second query scans the whole comment table which is very dangerous
> > for production servers.
>
> That's not quite true. The second does an index scan- the planner
> seems to be guessing that it'll fulfill the required limit early in
> the index scan; only with a pathologically bad case would it actually
> have to scan the entire thing. Basically, the second query is
> optimized to spit out the first few rows quickly, since that's all you
> asked for with the limit.
>
> Note that your first query has a final cost estimate of "Limit
> (cost=10261.19..10263.69 rows=1000 width=8)", indicating an estimated
> 10261.19 to emit the first row; the second has "Limit
> (cost=0.00..3588.42 rows=1 width=8)" estimating 0.00 (basically,
> instant) to emit the first - and only desired - row.
>
> That all said, an explain analyze would give us a better idea of
> what's going on- we can't tell if the planner is making bad estimates
> without the knowledge of what the real timing and row count results of
> plan stages were.
>
> --
> - David T. Wilson
> david.t.wil...@gmail.com
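P.S. For anyone hitting the same plan flip: one commonly suggested workaround (an untested sketch against this schema, not something I have benchmarked) is to order by an expression rather than the bare column, so the planner cannot satisfy the ORDER BY by walking _comment_pkey and instead uses idx_comment_parent_id plus a top-N sort, as in the fast LIMIT 1000 plan:

```sql
-- Untested sketch: "ORDER BY _comment.id + 0" cannot be served by the
-- _comment_pkey index scan, which steers the planner toward the
-- idx_comment_parent_id scan followed by a small top-N heapsort.
SELECT _comment.id,
       (get_comment_response(_comment.id)).id AS r_id
FROM _comment
INNER JOIN _article
        ON _article.id = _comment.parent_id
WHERE _comment.parent_id = '17355952'
ORDER BY _comment.id + 0 ASC
OFFSET 0
LIMIT 1;
```

The trade-off is that this defeats the index-ordered plan even in cases where it would have been the right choice, so it is a targeted hack rather than a general fix.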