Re: [PERFORM] Specific query performance problem help requested - postgresql 7.4

2005-06-02 Thread Brad Might
 
How is it that the index scan has such poor performance? Shouldn't index
lookups be quicker?
-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED] 
Sent: Thursday, May 26, 2005 1:32 PM
To: Brad Might
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Specific query performance problem help requested
- postgresql 7.4 

Brad Might [EMAIL PROTECTED] writes:
 Can someone help me break this down and figure out why the one query 
 takes so much longer than the other?

It looks to me like there's a correlation between filename and bucket,
such that the indexscan in filename order takes much longer to run
across the first 25 rows with bucket = 3 than it does to run across the
first 25 with bucket = 7 or bucket = 8.  It's not just a matter of there
being fewer rows with bucket = 3 ... the cost differential is much
larger than is explained by the count ratios.  The bucket = 3 rows have
to be lurking further to the back of the filename order than the others.

 Here's the bucket distribution..i have clustered the index on the 
 bucket value.

If you have an index on bucket, it's not doing you any good here anyway,
since you wrote the constraint as a crosstype operator (3 is int4 not
int8).  It might help to explicitly cast the constant to int8.

regards, tom lane


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Specific query performance problem help requested - postgresql 7.4

2005-05-26 Thread Tom Lane
Brad Might [EMAIL PROTECTED] writes:
 Can someone help me break this down and figure out why the one query
 takes so much longer than the other?

It looks to me like there's a correlation between filename and bucket,
such that the indexscan in filename order takes much longer to run
across the first 25 rows with bucket = 3 than it does to run across
the first 25 with bucket = 7 or bucket = 8.  It's not just a matter of
there being fewer rows with bucket = 3 ... the cost differential is much
larger than is explained by the count ratios.  The bucket = 3 rows have
to be lurking further to the back of the filename order than the others.

 Here's the bucket distribution..i have clustered the index on the bucket
 value.

If you have an index on bucket, it's not doing you any good here anyway,
since you wrote the constraint as a crosstype operator (3 is int4 not
int8).  It might help to explicitly cast the constant to int8.

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Specific query performance problem help requested - postgresql 7.4

2005-05-26 Thread Brad Might
Looks like I modified that constraint since the original has '3' and
explaining that shows the one I ended up running and posting has 3. 
Whn I explain on the original version it shows filter: (bucket =
3::bigint)

Can you elaborate on what you mean by:
 The bucket = 3 rows have to be lurking further to the back of the
filename order than the others

How does this apply to the index on filename?

It is possible that the data values are skewed, is there any way I can
gracefully handle this condition?
This query is being used to extract data for interactive display and the
time for bucket 3 is so out of 
whack with all the others (I've run this across all buckets and only
bucket 3 has the horrendous excecution
times)


Any suggestions for working around this problem to speed up execution?


Thanks for the help

brad


-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED] 
Sent: Thursday, May 26, 2005 1:32 PM
To: Brad Might
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Specific query performance problem help requested
- postgresql 7.4 

Brad Might [EMAIL PROTECTED] writes:
 Can someone help me break this down and figure out why the one query 
 takes so much longer than the other?

It looks to me like there's a correlation between filename and bucket,
such that the indexscan in filename order takes much longer to run
across the first 25 rows with bucket = 3 than it does to run across the
first 25 with bucket = 7 or bucket = 8.  It's not just a matter of there
being fewer rows with bucket = 3 ... the cost differential is much
larger than is explained by the count ratios.  The bucket = 3 rows have
to be lurking further to the back of the filename order than the others.

 Here's the bucket distribution..i have clustered the index on the 
 bucket value.

If you have an index on bucket, it's not doing you any good here anyway,
since you wrote the constraint as a crosstype operator (3 is int4 not
int8).  It might help to explicitly cast the constant to int8.

regards, tom lane


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org