Re: [PERFORM] index v. seqscan for certain values

2004-04-15 Thread Manfred Koizar
On Tue, 13 Apr 2004 13:55:49 -0400, Tom Lane [EMAIL PROTECTED] wrote:
Possibly the
nonuniform clumping of CID has something to do with the poor results.

It shouldn't.  The sampling algorithm is designed to give each tuple the
same chance of ending up in the sample, and tuples are selected
independently.  (IOW each one of the {N \choose n} possible samples has
the same probability.)   There are known problems with nonuniform
distribution of dead vs. live and large vs. small tuples, but AFAICS the
order of values does not matter.
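
(A quick worked illustration with made-up round numbers, not figures from
Jeremy's table: if N = 1,300,000 tuples and the sample holds n = 3,000 of
them, every individual tuple ends up in the sample with probability
n/N = 3,000 / 1,300,000 ~ 0.23%, regardless of where it sits in the table
or which CID it carries.  Clustering changes which tuples happen to be
neighbors, not that per-tuple probability.)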

Servus
 Manfred



Re: [PERFORM] index v. seqscan for certain values

2004-04-13 Thread Jeremy Dunn
  When I just tried it again with a value of 300, analyze, then run the
  query, I get a *worse* result for an estimate.  I don't understand this.
 
 That's annoying.  How repeatable are these results --- if you 
 do ANALYZE over again several times, how much does the row 
 count estimate change each time?  (It should change somewhat, 
 since ANALYZE is taking a random sample, but one would like 
 to think not a whole lot.)  Is the variance more or less at 
 the higher stats target?  Take a look at a few different CID 
 values to get a sense of the accuracy, don't look at just one ...

Yes, it's repeatable.  I tried a bunch of times, and there are only
small variations in the stats for the higher stat targets.

 (Actually, you might find it more profitable to look at the 
 pg_stats entry for the CID column rather than 
 reverse-engineering the stats via ANALYZE.  Look at how well 
 the most-common-values list and associated frequency numbers 
 track reality.)

I checked the accuracy of the stats for various values, and there is a
wide variation.  I see some values where the estimate is 1.75x the
actual; and others where the estimate is .44x the actual.

 Also, can you think of any reason for the distribution of CID 
 values to be nonuniform within the table?  For instance, do 
 rows get inserted in order of increasing CID, or is there any 
 clustering of rows with the same CID?

This is almost certainly the answer.  The data is initially inserted in
chunks for each CID, and later on there is a more normal distribution of
insert/update/deletes across all CIDs; and then again a new CID will
come with a large chunk of rows, etc.

Interestingly, I tried increasing the stat size for the CID column to
2000, analyzing, and checking the accuracy of the stats again.  Even
with this relatively high value, the accuracy of the stats is not that
close.   The value giving .44x previously now gives an estimate .77x of
actual.  Another value which was at 1.38x of actual is now at .71x of
actual!  

Then just for kicks I set the statistics size to 100,000 (!), analyzed,
and ran the query again.  For the same CID I still got an estimated row
count that is .71x the actual rows returned.  Why is this not better?  I
wonder how high I'd have to set the statistics collector to get really
good data, given the uneven data distribution of this table.  Is there
any other technique that works better to get good estimates, given
uneven distribution of values?

So I think this explains the inaccurate stats; and the solution as far
as I'm concerned is to increase the two params mentioned yesterday
(effective_cache_size & random_page_cost).

Thanks again for the help!
- Jeremy




Re: [PERFORM] index v. seqscan for certain values

2004-04-13 Thread Tom Lane
Jeremy Dunn [EMAIL PROTECTED] writes:
 Interestingly, I tried increasing the stat size for the CID column to
 2000, analyzing, and checking the accuracy of the stats again.

There's a hard limit of 1000, I believe.  Didn't it give you a warning
saying so?

At 1000 the ANALYZE sample size would be 300,000 rows, or about a quarter
of your table.  I would have thought this would give frequency estimates
with much better precision than you seem to be seeing --- but my
statistics are rusty enough that I'm not sure about it.  Possibly the
nonuniform clumping of CID has something to do with the poor results.
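
(A bit of arithmetic behind that, for anyone following along: ANALYZE
samples roughly 300 rows per unit of statistics target, so at a target of
1000 that is about 300 * 1000 = 300,000 rows, and 300,000 / 1,300,000 is
roughly 23% -- the "quarter of your table" above.  The factor of 300 is an
implementation detail and may differ in other releases.)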

Any stats majors on the list?

regards, tom lane



Re: [PERFORM] index v. seqscan for certain values

2004-04-13 Thread Jeremy Dunn

 There's a hard limit of 1000, I believe.  Didn't it give you
 a warning saying so?

No warning at 2000, and no warning at 100,000 either!

Remember we are still on 7.2.x.  The docs here
http://www.postgresql.org/docs/7.2/static/sql-altertable.html don't say
anything about a limit.  

This is good to know, if it's true.  Can anyone confirm?

- Jeremy




Re: [PERFORM] index v. seqscan for certain values

2004-04-12 Thread Bill Moran
Quick bit of input, since you didn't mention it.

How often do you run ANALYZE?  I found it interesting that a database I
was doing tests on sped up by a factor of 20 after ANALYZE.  If your
data changes a lot, you should probably schedule ANALYZE to run with
VACUUM.
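
A minimal sketch of such a nightly job, using the placeholder table name
that appears later in this thread:

   vacuum analyze xxx;

or simply "vacuum analyze;" to cover every table in the database.
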
Jeremy Dunn wrote:
I've searched the archives and can't find an answer to this seemingly 
simple question.  Apologies if it's too common.
 
The table in question has ~1.3M rows.  It has 85 columns, 5 of which 
have single-column indexes.
 
The column in question (CID) has 183 distinct values.  For these values, 
the largest has ~38,000 rows, and the smallest has 1 row.  About 30 
values have < 100 rows, and about 10 values have > 20,000 rows.
 
The database is 7.2.3 running on RedHat 7.1 (we are in the process of
upgrading to PG 7.4.2).  All of the query plan options are enabled, and
the cpu costs are set to the default values. ( cpu_tuple_cost is 0.01, 
cpu_index_tuple_cost is 0.001).  The database is VACUUM'd every night.
 
The problem:
A simple query:
select count(*) from xxx where CID=smalval
where smalval is a CID value which has relatively few rows, returns a 
plan using the index on that column.
 
   explain analyze select count(*) from xxx where cid=869366;
   Aggregate  (cost=19136.33..19136.33 rows=1 width=0) (actual 
time=78.49..78.49 rows=1 loops=1)
 ->  Index Scan using xxx_cid on emailrcpts  (cost=0.00..19122.21 
rows=5648 width=0) (actual time=63.40..78.46 rows=1 loops=1)
   Total runtime: 78.69 msec
 
The same plan is true for values which have up to about 20,000 rows:
 
   explain analyze select count(*) from xxx where cid=6223341;
   Aggregate  (cost=74384.19..74384.19 rows=1 width=0) (actual 
time=11614.89..11614.89 rows=1 loops=1)
 ->  Index Scan using xxx_cid on emailrcpts  (cost=0.00..74329.26 
rows=21974 width=0) (actual time=35.75..11582.10 rows=20114 loops=1)
   Total runtime: 11615.05 msec
However for the values that have > 20,000 rows, the plan changes to a 
sequential scan, which is proportionately much slower.
 
   explain analyze select count(*) from xxx where cid=7191032;
   Aggregate  (cost=97357.61..97357.61 rows=1 width=0) (actual 
time=46427.81..46427.82 rows=1 loops=1)
->  Seq Scan on xxx  (cost=0.00..97230.62 rows=50792 width=0) 
(actual time=9104.45..46370.27 rows=37765 loops=1)
Total runtime: 46428.00 msec
 
 
The question: why does the planner consider a sequential scan to be 
better for these top 10 values?  In terms of elapsed time it is more 
than twice as slow, proportionate to an index scan for the same number 
of rows.
 
What I tried:
 
A) alter table xxx alter column cid set statistics 500;   
analyze xxx;
This does not affect the results.
 
B)  dropped/rebuilt the index, with no improvement.
 
C) decreasing cpu_index_tuple_cost by a factor of up to 1000, with no 
success
 
D) force an index scan for the larger values by using a very high value 
for cpu_tuple_cost (e.g. .5) but this doesn't seem like a wise thing to do.
 
Your thoughts appreciated in advance!
 
- Jeremy 
 
7+ years experience in Oracle performance-tuning
relatively new to postgresql


--
Bill Moran
Potential Technologies
http://www.potentialtech.com


Re: [PERFORM] index v. seqscan for certain values

2004-04-12 Thread Jeremy Dunn
Sorry, I should have written that we do VACUUM VERBOSE ANALYZE every
night.

- Jeremy



Re: [PERFORM] index v. seqscan for certain values

2004-04-12 Thread Stephan Szabo

On Mon, 12 Apr 2004, Jeremy Dunn wrote:

explain analyze select count(*) from xxx where cid=6223341;
Aggregate  (cost=74384.19..74384.19 rows=1 width=0) (actual
 time=11614.89..11614.89 rows=1 loops=1)
  ->  Index Scan using xxx_cid on emailrcpts  (cost=0.00..74329.26
 rows=21974 width=0) (actual time=35.75..11582.10 rows=20114 loops=1)
Total runtime: 11615.05 msec

 However for the values that have > 20,000 rows, the plan changes to a
 sequential scan, which is proportionately much slower.

explain analyze select count(*) from xxx where cid=7191032;
Aggregate  (cost=97357.61..97357.61 rows=1 width=0) (actual
 time=46427.81..46427.82 rows=1 loops=1)
 ->  Seq Scan on xxx  (cost=0.00..97230.62 rows=50792 width=0)
 (actual time=9104.45..46370.27 rows=37765 loops=1)
 Total runtime: 46428.00 msec

 The question: why does the planner consider a sequential scan to be
 better for these top 10 values?  In terms of elapsed time it is more
 than twice as slow, proportionate to an index scan for the same number
 of rows.

One thing to do is to set enable_seqscan=off and run the above and compare
the estimated and real costs.  It may be possible to lower
random_page_cost to a still reasonable number in order to move the point
of the switchover to seqscan.
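
A rough sketch of that experiment, reusing one of the CID values and the
settings already quoted in this thread (the values are placeholders):

   set enable_seqscan to off;
   explain analyze select count(*) from xxx where cid=7191032;
   set enable_seqscan to on;
   set random_page_cost to 2;
   explain analyze select count(*) from xxx where cid=7191032;

Comparing the forced index scan's estimated and actual costs against the
default seqscan plan shows how far random_page_cost has to move before the
planner switches over by itself.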



Re: [PERFORM] index v. seqscan for certain values

2004-04-12 Thread Tom Lane
Jeremy Dunn [EMAIL PROTECTED] writes:
 The question: why does the planner consider a sequential scan to be
 better for these top 10 values?

At some point a seqscan *will* be better.  In the limit, if the key
being sought is common enough to occur on every page of the table,
it's certain that a seqscan will require less I/O than an indexscan
(because reading the index isn't actually saving you any heap fetches).
In practice the breakeven point is less than that because Unix kernels
are better at handling sequential than random access.
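
(A back-of-the-envelope illustration of that breakeven, using round numbers
loosely taken from the plans in this thread rather than the planner's real
formula: with the default random_page_cost of 4, fetching ~38,000 matching
rows from scattered heap pages is charged roughly 4 * 38,000 = 152,000
page-cost units, while the full sequential scan was costed at ~97,000 --
so on paper the seqscan looks cheaper for the most common values, even
though it ran slower on this particular machine.)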

Your gripe appears to be basically that the planner's idea of the
breakeven point is off a bit.  It looks to me like it's within about
a factor of 2 of being right, though, which is not all that bad when
it's using generic cost parameters.

 A) alter table xxx alter column cid set statistics 500;
 analyze xxx;
 This does not affect the results.

It probably improved the accuracy of the row count estimates, no?
The estimate you show for cid=7191032 is off by more than 25% (37765 vs
50792), which seems like a lot of error for one of the most common
values in the table.  (I hope that was with default stats target and
not 500.)  That leads directly to a 25% overestimate of the cost of
an indexscan, while having IIRC no impact on the cost of a seqscan.
Since the cost ratio was more than 25%, this didn't change the selected
plan, but you want to fix that error as best you can before you move
on to tweaking cost parameters.

 C) decreasing cpu_index_tuple_cost by a factor of up to 1000, with no
 success

Wrong thing.  You should be tweaking random_page_cost.  Looks to me like
a value near 2 might be appropriate for your setup.  Also it is likely
appropriate to increase effective_cache_size, which is awfully small in
the default configuration.  I'd set that to something related to your
available RAM before trying to home in on a suitable random_page_cost.
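
A hedged sketch of what those two settings might look like on a 1 GB box --
the numbers are guesses, since they depend on how much RAM is really left
for the OS cache, and effective_cache_size is counted in 8 kB pages here:

   set random_page_cost to 2;
   set effective_cache_size to 64000;   -- roughly 500 MB of assumed cache

or put the equivalent lines in postgresql.conf so they persist across
restarts.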

AFAIK hardly anyone bothers with changing the cpu_xxx costs ...

regards, tom lane



Re: [PERFORM] index v. seqscan for certain values

2004-04-12 Thread Jeremy Dunn
 Jeremy Dunn [EMAIL PROTECTED] writes:
  The question: why does the planner consider a sequential scan to be 
  better for these top 10 values?
 
 At some point a seqscan *will* be better.  In the limit, if 
 the key being sought is common enough to occur on every page 
 of the table, it's certain that a seqscan will require less 
 I/O than an indexscan (because reading the index isn't 
 actually saving you any heap fetches). In practice the 
 breakeven point is less than that because Unix kernels are 
 better at handling sequential than random access.
 
 Your gripe appears to be basically that the planner's idea of 
 the breakeven point is off a bit.  It looks to me like it's 
 within about a factor of 2 of being right, though, which is 
 not all that bad when it's using generic cost parameters.

Agreed.  However, given that count(*) is a question that can be answered
_solely_ using the index (without reference to the actual data blocks),
I'd expect that the break-even point would be considerably higher than
the ~3% (~38,000 / ~1.3M) I'm currently getting.  Does PG not use
solely the index in this situation??

  A) alter table xxx alter column cid set statistics 500;
  analyze xxx;
  This does not affect the results.
 
 It probably improved the accuracy of the row count estimates, 
 no? The estimate you show for cid=7191032 is off by more than 
 25% (37765 vs 50792), which seems like a lot of error for one 
 of the most common values in the table.  (I hope that was 
 with default stats target and not 500.)  That leads directly 
 to a 25% overestimate of the cost of an indexscan, while 
 having IIRC no impact on the cost of a seqscan. Since the 
 cost ratio was more than 25%, this didn't change the selected 
 plan, but you want to fix that error as best you can before 
 you move on to tweaking cost parameters.

Actually it made them worse!  Yes, this was the default statistics (10).
When I just tried it again with a value of 300, analyze, then run the
query, I get a *worse* result for an estimate.  I don't understand this.


   alter table xxx alter column cid set statistics 300;
   analyze emailrcpts;
   set random_page_cost to 2;
   explain analyze select count(*) from xxx where cid=7191032;

   Aggregate  (cost=20563.28..20563.28 rows=1 width=0) (actual
time=7653.90..7653.90 rows=1 loops=1)
  ->  Index Scan using xxx_cid on xxx  (cost=0.00..20535.82 rows=10983
width=0) (actual time=72.24..7602.38 rows=37765 loops=1)
   Total runtime: 7654.14 msec

Now it estimates I have only 10,983 rows (~3x too low) instead of the
old estimate 50,792 (1.3x too high).  Why is that ??

Anyway, a workable solution seems to be using a lower value for
Random_Page_Cost.  Thanks to everyone who replied with this answer. 

 Also it is likely appropriate to increase 
 effective_cache_size, which is awfully small in the default 
 configuration.  I'd set that to something related to your 
 available RAM before trying to home in on a suitable random_page_cost.

We have ours set to the default value of 1000, which does seem low for a
system with 1GB of RAM.  We'll up this once we figure out what's
available.  Then tweak the Random_Page_Cost appropriately at that point.

I'd still like to understand the strangeness above, if anyone can shed
light.

- Jeremy




Re: [PERFORM] index v. seqscan for certain values

2004-04-12 Thread Bruno Wolff III
On Mon, Apr 12, 2004 at 15:05:02 -0400,
  Jeremy Dunn [EMAIL PROTECTED] wrote:
 
 Agreed.  However, given that count(*) is a question that can be answered
 _solely_ using the index (without reference to the actual data blocks),
 I'd expect that the break-even point would be considerably higher than
 the ~3% (~38,000 / ~1.3M) I'm currently getting.  Does PG not use
 solely the index in this situation??

That isn't true. In order to check visibility you need to look at the
data blocks.



Re: [PERFORM] index v. seqscan for certain values

2004-04-12 Thread Tom Lane
Jeremy Dunn [EMAIL PROTECTED] writes:
 Agreed.  However, given that count(*) is a question that can be answered
 _solely_ using the index (without reference to the actual data blocks),

As Bruno noted, that is not the case in Postgres; we must visit the
table rows anyway.

 When I just tried it again with a value of 300, analyze, then run the
 query, I get a *worse* result for an estimate.  I don't understand this.

That's annoying.  How repeatable are these results --- if you do ANALYZE
over again several times, how much does the row count estimate change
each time?  (It should change somewhat, since ANALYZE is taking a random
sample, but one would like to think not a whole lot.)  Is the variance
more or less at the higher stats target?  Take a look at a few different
CID values to get a sense of the accuracy, don't look at just one ...

(Actually, you might find it more profitable to look at the pg_stats
entry for the CID column rather than reverse-engineering the stats via
ANALYZE.  Look at how well the most-common-values list and associated
frequency numbers track reality.)
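
A minimal query for that, assuming the table and column names used earlier
in the thread:

   select most_common_vals, most_common_freqs, n_distinct
   from pg_stats
   where tablename = 'xxx' and attname = 'cid';

Multiplying a most_common_freqs entry by the table's row count gives the row
estimate the planner will use for that CID, which can be checked against an
actual count(*).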

Also, can you think of any reason for the distribution of CID values
to be nonuniform within the table?  For instance, do rows get inserted
in order of increasing CID, or is there any clustering of rows with the
same CID?

regards, tom lane
