Re: [PERFORM] Order by (for 15 rows) adds 30 seconds to query time

2009-12-02 Thread Matthew Wakeling
On Tue, 1 Dec 2009, Jean-Michel Pouré wrote: PostgreSQL query analyzer needs to run a couple of times before it can rewrite and optimize the query. Make sure demand_id, id and join IDs carry indexes. Huh? At what point does the planner carry over previous plans and use them to further optimise

Re: [PERFORM] Cost of sort/order by not estimated by the query planner

2009-12-02 Thread Laurent Laborde
hummm Adding pgsql-perf :) On Mon, Nov 30, 2009 at 5:54 PM, Laurent Laborde wrote: > Friendly greetings ! > I use postgresql 8.3.6. > > here is a few info about the table i'm querying : > - > - select count(*) from _article : 1730161

Re: [PERFORM] Cost of sort/order by not estimated by the query planner

2009-12-02 Thread Greg Stark
On Wed, Dec 2, 2009 at 11:13 AM, Laurent Laborde wrote: >>                                             QUERY PLAN >> - >>  Limit  (cost=66114.13..66115.38 rows=500 width=1114) >>   ->  Sort  (cost=6

Re: [PERFORM] Cost of sort/order by not estimated by the query planner

2009-12-02 Thread Laurent Laborde
On Wed, Dec 2, 2009 at 1:42 PM, Greg Stark wrote: > On Wed, Dec 2, 2009 at 11:13 AM, Laurent Laborde wrote: >>>                                             QUERY PLAN >>> - >>>  Limit  (cost=66114.

Re: [PERFORM] Cost of sort/order by not estimated by the query planner

2009-12-02 Thread Greg Stark
On Wed, Dec 2, 2009 at 11:13 AM, Laurent Laborde wrote: >>                                           QUERY PLAN >> - >>  Limit  (cost=0.00..2042.87 rows=5 width=1114) >>   ->  Index Scan using _article_

Re: [PERFORM] Cost of sort/order by not estimated by the query planner

2009-12-02 Thread Laurent Laborde
On Wed, Dec 2, 2009 at 1:47 PM, Greg Stark wrote: > On Wed, Dec 2, 2009 at 11:13 AM, Laurent Laborde wrote: >>>                                           QUERY PLAN >>> - >>>  Limit  (cost=0.00..2042.8

Re: [PERFORM] Cost of sort/order by not estimated by the query planner

2009-12-02 Thread Robert Haas
On Wed, Dec 2, 2009 at 8:01 AM, Laurent Laborde wrote: > On Wed, Dec 2, 2009 at 1:47 PM, Greg Stark wrote: >> On Wed, Dec 2, 2009 at 11:13 AM, Laurent Laborde wrote:                                           QUERY PLAN ---

Re: [PERFORM] Cost of sort/order by not estimated by the query planner

2009-12-02 Thread Laurent Laborde
On Wed, Dec 2, 2009 at 2:17 PM, Robert Haas wrote: > > I'm confused.  I think you've only shown us two query plans, so it's > hard to judge what's going on here in the two cases you haven't shown. >  Also, you haven't shown the EXPLAIN ANALYZE output, so it's a bit > tricky to judge what is really

Re: [PERFORM] Order by (for 15 rows) adds 30 seconds to query time

2009-12-02 Thread Craig Ringer
On 2/12/2009 7:08 PM, Matthew Wakeling wrote: On Tue, 1 Dec 2009, Jean-Michel Pouré wrote: PostgreSQL query analyzer needs to run a couple of times before it can rewrite and optimize the query. Make sure demand_id, id and join IDs carry indexes. Huh? At what point does the planner carry over p

Re: [PERFORM] Cost of sort/order by not estimated by the query planner

2009-12-02 Thread Laurent Laborde
* without order by, limit 5 : 70ms -- explain analyze SELECT * FROM _article WHERE (_article.bitfield && getbit(0)) LIMIT 5; QUERY PLAN : Limit (cost=0.00..20.03 rows=5 width=1109) (actual time=70.190..70.265 rows=5 loops=1) -> Index Scan using idx_article_

Re: [PERFORM] Cost of sort/order by not estimated by the query planner

2009-12-02 Thread Robert Haas
On Wed, Dec 2, 2009 at 10:32 AM, Laurent Laborde wrote: > * without order by, limit 5 : 70ms > -- >  explain analyze SELECT * > FROM   _article > WHERE (_article.bitfield && getbit(0)) > LIMIT 5; > > QUERY PLAN  : > Limit  (cost=0.00..20.03 rows=5 width=1109) (actua

Re: [PERFORM] Cost of sort/order by not estimated by the query planner

2009-12-02 Thread Tom Lane
Robert Haas writes: > The exact break-even point between the two plans will vary depending > on what percentage of the rows in the table satisfy the bitmap > condition. It's worse than that. The planner is not too bad about understanding the percentage-of-rows problem --- at least, assuming you

Re: [PERFORM] Order by (for 15 rows) adds 30 seconds to query time

2009-12-02 Thread Kevin Grittner
Craig Ringer wrote: > Some of those tables are views composed of multiple unions, too, > by the looks of things. > > Doesn't the planner have some ... issues ... with estimation of > row counts on joins over unions? Or is my memory just more faulty > than usual? So far I can't tell if it's vi

Re: [PERFORM] Order by (for 15 rows) adds 30 seconds to query time

2009-12-02 Thread Tom Lane
"Kevin Grittner" writes: > Craig Ringer wrote: >> Doesn't the planner have some ... issues ... with estimation of >> row counts on joins over unions? Or is my memory just more faulty >> than usual? > So far I can't tell if it's views with unions or (as I suspect) > inheritance. As of recent ve

Re: [PERFORM] Order by (for 15 rows) adds 30 seconds to query time

2009-12-02 Thread Kevin Grittner
Tom Lane wrote: > And yeah, I think the statistical support is pretty crummy. Do you know, off-hand, why the estimated row count for a "Nested Loop Left Join" is not the product of the estimates for the two sides? (I fear I'm missing something important which lead to the current estimates.)

Re: [PERFORM] Order by (for 15 rows) adds 30 seconds to query time

2009-12-02 Thread Tom Lane
"Kevin Grittner" writes: > Estimates extracted from the problem plan: > Nested Loop Left Join (rows=806903677108) > -> Nested Loop Left Join (rows=203176856) > -> Nested Loop Left Join (rows=51160) > -> Nested Loop Left Join (rows=28) > ->

Re: [PERFORM] Order by (for 15 rows) adds 30 seconds to query time

2009-12-02 Thread Kevin Grittner
Tom Lane wrote: > That does look weird. Do we have a self-contained test case? Richard, could you capture the schema for the affected tables and views with pg_dump -s and also the related rows from pg_statistic? (The actual table contents aren't needed to see this issue.) -Kevin -- Sent v

Re: [PERFORM] Order by (for 15 rows) adds 30 seconds to query time

2009-12-02 Thread Richard Neill
Kevin Grittner wrote: Tom Lane wrote: That does look weird. Do we have a self-contained test case? Not at the moment. It seems to only occur with relatively complex joins. Richard, could you capture the schema for the affected tables and views with pg_dump -s and also the related row

Re: [PERFORM] Order by (for 15 rows) adds 30 seconds to query time

2009-12-02 Thread Kevin Grittner
Richard Neill wrote: > Regarding pg_statistic, I don't understand how to find the > relevant rows - what am I looking for? (the pg_statistic table is > 247M in size). I think the only relevant rows would be the ones with starelid = pg_class.oid for a table used in the query, and I think you c

Re: [PERFORM] Order by (for 15 rows) adds 30 seconds to query time

2009-12-02 Thread Richard Neill
Kevin Grittner wrote: Richard Neill wrote: Regarding pg_statistic, I don't understand how to find the relevant rows - what am I looking for? (the pg_statistic table is 247M in size). I think the only relevant rows would be the ones with starelid = pg_class.oid for a table used in the qu

Re: [PERFORM] Query times change by orders of magnitude as DB ages

2009-12-02 Thread Scott Carey
On 11/25/09 4:18 AM, "Matthew Wakeling" wrote: > > The problem is that vacuum full does a full compact of the table, but it > has to update all the indexes as it goes. This makes it slow, and causes > bloat to the indexes. There has been some discussion of removing the > command or at least putt

Re: [PERFORM] Analyse without locking?

2009-12-02 Thread Richard Neill
Dear All, I'm still puzzled by this one - it looks like it's causing about 5% of queries to rise in duration from ~300ms to 2-6 seconds. On the other hand, the system never seems to be I/O bound. (we have at least 25 MB/sec of write bandwidth, and use a small fraction of that normally). He

Re: [PERFORM] Checkpoint spikes

2009-12-02 Thread Greg Smith
Richard Neill wrote: On the other hand, the system never seems to be I/O bound. (we have at least 25 MB/sec of write bandwidth, and use a small fraction of that normally). I would bet that if you sampled vmstat or iostat every single second, you'd discover there's a large burst in write speed f