Re: [PERFORM] Performance With Joins on Large Tables

2006-09-13 Thread Terje Elde

Jeff Davis wrote:

Is it overestimating the cost of using indexes or underestimating the
cost of a seq scan, or both? Maybe explain with the 0.1 setting will
help?


If enable_seqscan is off, and cost is still set to 1, it could 
be that it's quite simply forcibly underestimating the cost of a seqscan 
in this case.


If enable_seqscan was off for the mentioned plan, it'd be interesting to 
see whether things look saner with seqscans enabled and a more 
reasonable random_page_cost.  If more 'sane' values still produce the 
desired plan, they might also work better for other plans, etc.
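
Something along these lines, just as a sketch (the values are only 
examples, not recommendations for this particular box):

  SET enable_seqscan = on;
  SET random_page_cost = 4;  -- the default; 2-3 is often suggested when data is well cached
  -- then re-run the problem query:
  -- EXPLAIN ANALYZE SELECT ... ;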


Terje




Re: [PERFORM] Blocks read for index scans

2006-04-18 Thread Terje Elde

Jim C. Nasby wrote:
That said, it's the transactions against disk that typically matter.  On 
FreeBSD, you can get an impression of this using 'systat -vmstat', and 
watch the KB/t column for your drives.



On a related note, do you know of any way to determine the breakdown
between read activity and write activity on FreeBSD? vmstat, systat,
iostat all only return aggregate info. :(



Can't think of a right way to do this ATM, but for a lab-type setup to 
get an idea, you could set up a gmirror volume, then choose a balancing 
algorithm that only reads from one of the disks.  The effect should be that 
writes go to both, while reads only go to one.  Activity on the 
write-only disk would give you an idea of the write activity, and 
(read/write disk - write-only disk) would give you an idea of the 
reads.  I have to admit though, it seems like quite a bit of hassle, and 
I'm not sure how good the numbers would be; given that at least some of 
the info (KB/transaction) comes as totals, it'd take a bit of math to get 
decent numbers.  But at least it's something.
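
Roughly something like this (mirror and device names are made up, and I 
haven't double-checked the flags or priorities, so see gmirror(8)):

  # build a two-disk mirror with the 'prefer' balance algorithm, so
  # reads are served from the preferred component only while writes
  # still go to both
  gmirror label -v -b prefer gm0 /dev/da1 /dev/da2
  # (component priorities may need adjusting so one disk is clearly preferred)

  # then watch per-disk activity; the disk that only sees writes
  # approximates the write load
  systat -vmstat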


Terje





Re: [PERFORM] Blocks read for index scans

2006-04-14 Thread Terje Elde

Jim Nasby wrote:
While working on determining a good stripe size for a database, I 
realized it would be handy to know what the average request size is. 
Getting this info is a simple matter of joining pg_stat_all_tables and 
pg_statio_all_tables and doing some math, but there's one issue I've 
found; it appears that there's no information on how many heap blocks 
were read in by an index scan. Is there any way to get that info?
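
For the archives, the join you describe could start out something like the 
untested sketch below (8192 assumes the default block size; and as you 
point out, heap_blks_read lumps together blocks read by seq scans and by 
index scans, so there's no clean breakdown):

  SELECT s.relname,
         s.seq_scan,
         s.idx_scan,
         io.heap_blks_read,
         io.heap_blks_read * 8192 AS heap_bytes_read
    FROM pg_stat_all_tables s
    JOIN pg_statio_all_tables io USING (relid);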


RAID usually doesn't work the way most people think. ;)

Not sure how well you know RAID, so I'm mentioning a few points just in 
case, and for the archives.


If your average request is for 16K and you choose a 16K stripe size, 
then roughly half your requests (assuming a normal bell curve) would be 
larger than a single stripe, and you've just ensured that half your 
requests need two spindles to seek instead of one.  If that's done 
sequentially, you're set for less than half the performance of a 
flat disk.


Knowing what the average request size is can be a good place to start, 
but the real question is: which stripe size will allow the majority of 
your transactions to be satisfied without having to go to two 
spindles?


I've actually had good success with 2MB stripe sizes using software 
raid.  If the reads are fairly well distributed, all the drives are hit 
equally, and very few small requests have to go to two spindles.
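
To put rough numbers on it: if request starts are spread more or less 
evenly, a 16K request straddles a stripe boundary almost every time on a 
16K stripe (unless it happens to be aligned), but only about 16/2048, or 
well under 1% of the time, on a 2MB stripe.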


Read speeds from modern drives are fast.  It's usually the seeks that 
kill performance, so making sure you reduce the number of seeks should 
almost always be the priority.


That said, it's the transactions against disk that typically matter.  On 
FreeBSD, you can get an impression of this using 'systat -vmstat', and 
watch the KB/t column for your drives.


A seek always takes time: the head has to move, settle down, and find 
the right place to start reading.  A seek over a longer distance takes 
more time though, so even if your transactions are pretty small, using a 
large stripe size can be a good thing if you have lots of small 
transactions that are close by.  The head will already be in the area, 
reducing seek time.


This all depends on what types of load you have, and it's hard to 
generalize too much on what makes things fast.  As always, it pretty 
much boils down to trying things while running as close to production 
load as you can.


Terje


