Re: [PERFORM] Query using SeqScan instead of IndexScan

2006-04-04 Thread Jim Nasby
On Apr 1, 2006, at 12:51 PM, Brendan Duddridge wrote: from SELECT * FROM pg_stats WHERE tablename='table' AND attname='category_id' I find correlation on category_product for category_id is 0.643703 Would setting the index on category_id to be clustered help with this? It would absolutely h

Re: [PERFORM] Query using SeqScan instead of IndexScan

2006-04-04 Thread Jim Nasby
On Apr 2, 2006, at 6:30 PM, Josh Berkus wrote: But just as a follow up question to your #1 suggestion, I have 8 GB of ram in my production server. You're saying to set the effective_cache_size then to 5 GB roughly? Somewhere around 655360? Currently it is set to 65535. Is that something that's OS

Re: [PERFORM] Query using SeqScan instead of IndexScan

2006-04-02 Thread Brendan Duddridge
Hi Josh, Thanks. I've adjusted my effective_cache_size to 5 GB, so we'll see how that goes. I'm also doing some query and de-normalization optimizations so we'll see how those go too. Brendan Duddridge | CTO | 403-277-55

Re: [PERFORM] Query using SeqScan instead of IndexScan

2006-04-02 Thread Josh Berkus
Brendan, > But just as a follow up question to your #1 suggestion, I have 8 GB > of ram in my production server. You're saying to set the > effective_cache_size then to 5 GB roughly? Somewhere around 655360? > Currently it is set to 65535. Is that something that's OS dependent? > I'm not sure how

Re: [PERFORM] Query using SeqScan instead of IndexScan

2006-04-01 Thread Tom Lane
Mark Kirkwood <[EMAIL PROTECTED]> writes: > It is a bit confusing - '(distinct) cardinality' might be a better > heading for their 'cardinality' column! The usual mathematical meaning of "cardinality" is "the number of members in a set". That isn't real helpful for the point at hand, because the

Re: [PERFORM] Query using SeqScan instead of IndexScan

2006-04-01 Thread Mark Kirkwood
chris smith wrote: It'd be nice if the database developers agreed on what terms meant. http://dev.mysql.com/doc/refman/5.1/en/myisam-index-statistics.html The SHOW INDEX statement displays a cardinality value based on N/S, where N is the number of rows in the table and S is the average value

Re: [PERFORM] Query using SeqScan instead of IndexScan

2006-04-01 Thread Alvaro Herrera
chris smith wrote: > I believe postgres (because it's a lot more standards compliant).. but > sheesh - what a difference! > > This week's task - stop reading mysql documentation. You don't _have_ to believe Postgres -- this is stuff taught in any statistics course. -- Alvaro Herrera

Re: [PERFORM] Query using SeqScan instead of IndexScan

2006-04-01 Thread chris smith
On 4/2/06, chris smith <[EMAIL PROTECTED]> wrote: > On 4/2/06, Jim C. Nasby <[EMAIL PROTECTED]> wrote: > > On Sat, Apr 01, 2006 at 11:23:37AM +1000, chris smith wrote: > > > On 4/1/06, Brendan Duddridge <[EMAIL PROTECTED]> wrote: > > > > Hi Jim, > > > > > > > > I'm not quite sure what you mean by t

Re: [PERFORM] Query using SeqScan instead of IndexScan

2006-04-01 Thread chris smith
On 4/2/06, Jim C. Nasby <[EMAIL PROTECTED]> wrote: > On Sat, Apr 01, 2006 at 11:23:37AM +1000, chris smith wrote: > > On 4/1/06, Brendan Duddridge <[EMAIL PROTECTED]> wrote: > > > Hi Jim, > > > > > > I'm not quite sure what you mean by the correlation of category_id? > > > > It means how many disti

Re: [PERFORM] Query using SeqScan instead of IndexScan

2006-04-01 Thread Brendan Duddridge
Hi Jim, from SELECT * FROM pg_stats WHERE tablename='table' AND attname='category_id' I find correlation on category_product for category_id is 0.643703 Would setting the index on category_id to be clustered help with this? Thanks, __

Re: [PERFORM] Query using SeqScan instead of IndexScan

2006-04-01 Thread Jim C. Nasby
On Sat, Apr 01, 2006 at 11:23:37AM +1000, chris smith wrote: > On 4/1/06, Brendan Duddridge <[EMAIL PROTECTED]> wrote: > > Hi Jim, > > > > I'm not quite sure what you mean by the correlation of category_id? > > It means how many distinct values does it have (at least that's my > understanding of i

Re: [PERFORM] Query using SeqScan instead of IndexScan

2006-03-31 Thread Brendan Duddridge
Ah I see. Ok, well we have a very wide variety here... category_id | count -+--- 1000521 | 31145 1001211 | 22991 1001490 | 22019 1001628 | 12472 146 | 10480 187 | 10338 1001223 | 10020 1001560 | 9532 1000954 | 8633 100131

Re: [PERFORM] Query using SeqScan instead of IndexScan

2006-03-31 Thread chris smith
On 4/1/06, Brendan Duddridge <[EMAIL PROTECTED]> wrote: > Hi Jim, > > I'm not quite sure what you mean by the correlation of category_id? It means how many distinct values does it have (at least that's my understanding of it ;) ). select category_id, count(*) from category_product group by catego

Re: [PERFORM] Query using SeqScan instead of IndexScan

2006-03-31 Thread Brendan Duddridge
Hi Jim, I'm not quite sure what you mean by the correlation of category_id? The category_id is part of a compound primary key in the category_product table. The primary key on category_product is (category_id, product_id). Here's the definitions of the two tables involved in the join:

Re: [PERFORM] Query using SeqScan instead of IndexScan

2006-03-31 Thread Jim C. Nasby
What's the correlation of category_id? The current index scan cost estimator places a heavy penalty on anything with a correlation much below about 90%. On Wed, Mar 29, 2006 at 08:12:28PM -0700, Brendan Duddridge wrote: > Hi, > > I have a query that is using a sequential scan instead of an index

Re: [PERFORM] Query using SeqScan instead of IndexScan

2006-03-29 Thread Josh Berkus
Brenden, > Any ideas what I can do to improve this without turning sequential   > scanning off? Hmmm, looks like your row estimates are good. Which means it's probably your postgresql.conf parameters which are off. Try the following, in the order below: 1) Raise effective_cache_size to 2/3 o

Re: [PERFORM] Query using SeqScan instead of IndexScan

2006-03-29 Thread Brendan Duddridge
Oops. I forgot to mention that I was using PostgreSQL 8.1.3 on Mac OS X.Thanks, Brendan Duddridge | CTO | 403-277-5591 x24 |  [EMAIL PROTECTED] ClickSpace Interactive Inc. Suite L100, 239 - 10th Ave. SE Calgary, AB  T2G 0V9 http

[PERFORM] Query using SeqScan instead of IndexScan

2006-03-29 Thread Brendan Duddridge
Hi,I have a query that is using a sequential scan instead of an index scan. I've turned off sequential scans and it is in fact faster with the index scan.Here's my before and after.Before:ssdev=# SET enable_seqscan TO DEFAULT;ssdev=# explain analyze select cp.product_id from category_product cp, p