Re: [PERFORM] Sub-optimal plan chosen

2009-09-10 Thread tv
 default_statistics_target = 100 (tried with 500, no change). Vacuum
 analyzed
 before initial query, and after each change to default_statistics_target.

Modifying the statistics target is useful only if the estimates are
seriously off, which is not your case - so it won't help, at least not
reliably.

 The same query, with a different ofid, will occasionally get the more
 optimal plan -- I assume that the distribution of data is the
 differentiator
 there.

Yes, the difference between costs of the two plans is quite small (11796
vs. 13153) so it's very sensible to data distribution.

 Is there any other data I can provide to shed some light on this?

You may try to play with the 'cost' constants - see this:

http://www.postgresql.org/docs/8.4/interactive/runtime-config-query.html#RUNTIME-CONFIG-QUERY-CONSTANTS

You just need to modify them so that the bitmap index scan / bitmap heap
scan is prefered to plain index scan.

Just be careful - if set in the postgresql.conf, it affects all the
queries and may cause serious problems with other queries. So it deserves
proper testing ...

regards
Tomas


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Sub-optimal plan chosen

2009-09-10 Thread bricklen
Hi Tomas,

2009/9/10 t...@fuzzy.cz

  default_statistics_target = 100 (tried with 500, no change). Vacuum
  analyzed
  before initial query, and after each change to default_statistics_target.

 Modifying the statistics target is useful only if the estimates are
 seriously off, which is not your case - so it won't help, at least not
 reliably.

  The same query, with a different ofid, will occasionally get the more
  optimal plan -- I assume that the distribution of data is the
  differentiator
  there.

 Yes, the difference between costs of the two plans is quite small (11796
 vs. 13153) so it's very sensible to data distribution.

  Is there any other data I can provide to shed some light on this?

 You may try to play with the 'cost' constants - see this:


 http://www.postgresql.org/docs/8.4/interactive/runtime-config-query.html#RUNTIME-CONFIG-QUERY-CONSTANTS

 You just need to modify them so that the bitmap index scan / bitmap heap
 scan is prefered to plain index scan.

 Just be careful - if set in the postgresql.conf, it affects all the
 queries and may cause serious problems with other queries. So it deserves
 proper testing ...

 regards
 Tomas



Playing around with seq_page_cost (1) and random_page_cost (1), I can get
the correct index selected. Applying those same settings to our production
server does not produce the optimal plan, though.


Re: [PERFORM] Sub-optimal plan chosen

2009-09-10 Thread tv
 Playing around with seq_page_cost (1) and random_page_cost (1), I can get
 the correct index selected. Applying those same settings to our production
 server does not produce the optimal plan, though.

I doubt setting seq_page_cost and random_page_cost to the same value is
reasonable - random access is almost always more expensive than sequential
access.

Anyway, post the EXPLAIN ANALYZE output from the production server. Don't
forget there are other _cost values - try to modify them too, but I'm not
sure how these values relate to the bitmap heap scan / bitmap index plans.

regards
Tomas


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Sub-optimal plan chosen

2009-09-10 Thread Tom Lane
bricklen brick...@gmail.com writes:
 Is there any other data I can provide to shed some light on this?

The table and index definitions?

The straight indexscan would probably win if the index column order
were ofid, date instead of date, ofid.  I can't tell if you have
any other queries for which the existing column order is preferable,
though.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Sub-optimal plan chosen

2009-09-10 Thread bricklen
On Thu, Sep 10, 2009 at 8:43 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 bricklen brick...@gmail.com writes:
  Is there any other data I can provide to shed some light on this?

 The table and index definitions?

 The straight indexscan would probably win if the index column order
 were ofid, date instead of date, ofid.  I can't tell if you have
 any other queries for which the existing column order is preferable,
 though.

regards, tom lane



Changing the order of the WHERE predicates didn't help. The indexes are
mostly defined as single-column indexes, with the exception of the
statsdaily_unique_idx one:

statsdaily_id_pk PRIMARY KEY, btree (id)
statsdaily_unique_idx UNIQUE, btree (date, idaf, idsite, ofid, idcreative,
idoptimizer)
statsdaily_date_idx btree (date)
statsdaily_ofid_idx btree (ofid)
statsdaily_ofid_sept2009_idx btree (ofid) WHERE date = '2009-09-01'::date


Re: [PERFORM] Sub-optimal plan chosen

2009-09-10 Thread Robert Haas
2009/9/10  t...@fuzzy.cz:
 Playing around with seq_page_cost (1) and random_page_cost (1), I can get
 the correct index selected. Applying those same settings to our production
 server does not produce the optimal plan, though.

 I doubt setting seq_page_cost and random_page_cost to the same value is
 reasonable - random access is almost always more expensive than sequential
 access.

If the data figures to be read from the OS cache, it's very
reasonable, and the right value is somewhere in the 0.05 - 0.10 range.

...Robert

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Sub-optimal plan chosen

2009-09-10 Thread bricklen
On Thu, Sep 10, 2009 at 9:57 AM, Robert Haas robertmh...@gmail.com wrote:

 2009/9/10  t...@fuzzy.cz:
  Playing around with seq_page_cost (1) and random_page_cost (1), I can
 get
  the correct index selected. Applying those same settings to our
 production
  server does not produce the optimal plan, though.
 
  I doubt setting seq_page_cost and random_page_cost to the same value is
  reasonable - random access is almost always more expensive than
 sequential
  access.

 If the data figures to be read from the OS cache, it's very
 reasonable, and the right value is somewhere in the 0.05 - 0.10 range.


For the most part, it will indeed be cached. Thanks for the tip on the
values.


Re: [PERFORM] Sub-optimal plan chosen

2009-09-10 Thread Robert Haas
On Thu, Sep 10, 2009 at 12:56 PM, bricklen brick...@gmail.com wrote:
 On Thu, Sep 10, 2009 at 8:43 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 bricklen brick...@gmail.com writes:
  Is there any other data I can provide to shed some light on this?

 The table and index definitions?

 The straight indexscan would probably win if the index column order
 were ofid, date instead of date, ofid.  I can't tell if you have
 any other queries for which the existing column order is preferable,
 though.

                        regards, tom lane


 Changing the order of the WHERE predicates didn't help.

He's talking about the index definition, not the WHERE clause.  The
order of the WHERE clause is totally irrelevant.

...Robert

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Sub-optimal plan chosen

2009-09-10 Thread bricklen
On Thu, Sep 10, 2009 at 10:02 AM, Robert Haas robertmh...@gmail.com wrote:

 On Thu, Sep 10, 2009 at 12:56 PM, bricklen brick...@gmail.com wrote:
  On Thu, Sep 10, 2009 at 8:43 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 
  bricklen brick...@gmail.com writes:
   Is there any other data I can provide to shed some light on this?
 
  The table and index definitions?
 
  The straight indexscan would probably win if the index column order
  were ofid, date instead of date, ofid.  I can't tell if you have
  any other queries for which the existing column order is preferable,
  though.
 
 regards, tom lane
 
 
  Changing the order of the WHERE predicates didn't help.

 He's talking about the index definition, not the WHERE clause.  The
 order of the WHERE clause is totally irrelevant.


Ah, sorry, missed that.


Re: [PERFORM] Sub-optimal plan chosen

2009-09-10 Thread bricklen
On Thu, Sep 10, 2009 at 10:07 AM, bricklen brick...@gmail.com wrote:

 On Thu, Sep 10, 2009 at 10:02 AM, Robert Haas robertmh...@gmail.comwrote:

 On Thu, Sep 10, 2009 at 12:56 PM, bricklen brick...@gmail.com wrote:
  On Thu, Sep 10, 2009 at 8:43 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 
  bricklen brick...@gmail.com writes:
   Is there any other data I can provide to shed some light on this?
 
  The table and index definitions?
 
  The straight indexscan would probably win if the index column order
  were ofid, date instead of date, ofid.  I can't tell if you have
  any other queries for which the existing column order is preferable,
  though.
 
 regards, tom lane
 
 
  Changing the order of the WHERE predicates didn't help.

 He's talking about the index definition, not the WHERE clause.  The
 order of the WHERE clause is totally irrelevant.


 Ah, sorry, missed that.



I just created a new index as Tom said, and the query *does* use the new
index (where ofid precedes date in the definition).


Re: [PERFORM] Sub-optimal plan chosen

2009-09-10 Thread Tom Lane
bricklen brick...@gmail.com writes:
 I just created a new index as Tom said, and the query *does* use the new
 index (where ofid precedes date in the definition).

And is it indeed faster than the other alternatives?

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Sub-optimal plan chosen

2009-09-10 Thread bricklen
On Thu, Sep 10, 2009 at 10:56 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 bricklen brick...@gmail.com writes:
  I just created a new index as Tom said, and the query *does* use the new
  index (where ofid precedes date in the definition).

 And is it indeed faster than the other alternatives?

regards, tom lane


About the same as the earlier, faster plan:

 Aggregate  (cost=2342.79..2342.80 rows=1 width=8) (actual time=8.433..8.433
rows=1 loops=1)
   -  Index Scan using statsdaily_ofid_date on statsdaily
(cost=0.00..2330.61 rows=4873 width=8) (actual time=0.089..5.043 rows=3125
loops=1)
 Index Cond: ((ofid = 38) AND (date = '2009-09-01'::date) AND (date
= '2999-01-01'::date))
 Total runtime: 8.470 ms