On 3/10/09 6:28 AM, "Matthew Wakeling" <matt...@flymine.org> wrote:
> On Tue, 10 Mar 2009, henk de wit wrote:
> > It is frequently said that for PostgreSQL the number 1 thing to pay
> > attention to when increasing performance is the amount of IOPS a storage
> > system is capable of. Now I wonder if there is any situation in which
> > sequential IO performance comes into play. E.g. perhaps during a
> > tablescan on a non-fragmented table, or during a backup or restore?
>
> Yes, up to a point. That point is when a single CPU can no longer handle
> the sequential transfer rate. Yes, there are some parallel restore
> possibilities which will get you further. Generally it only takes a few
> discs to max out a single CPU though.

This is not true if you have concurrent sequential scans. Then an array can
be tuned for total throughput under concurrent access. Single-threaded
sequential measurements are about as useful as single-threaded random I/O
measurements: not really a test of how the DB will act, but useful as a
starting point for tuning.
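
As a rough illustration of measuring concurrent rather than single-threaded sequential reads, here is a minimal Python sketch. It is not the tool I used; the function names and the 1 MiB chunk size are made up for the example, and it assumes you point it at files much larger than RAM (otherwise the OS page cache will inflate the numbers).

```python
import time
from concurrent.futures import ThreadPoolExecutor

CHUNK = 1 << 20  # 1 MiB per read; an arbitrary choice for this sketch


def read_sequentially(path, chunk_size=CHUNK):
    """Read one file front to back and return the number of bytes read."""
    total = 0
    with open(path, "rb", buffering=0) as f:
        while True:
            block = f.read(chunk_size)
            if not block:
                break
            total += len(block)
    return total


def concurrent_read_throughput(paths, chunk_size=CHUNK):
    """Read all paths at once, one thread per file.

    Returns (total_bytes, aggregate_MB_per_sec).
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max(1, len(paths))) as pool:
        sizes = list(pool.map(lambda p: read_sequentially(p, chunk_size), paths))
    elapsed = max(time.perf_counter() - start, 1e-9)
    total = sum(sizes)
    return total, total / elapsed / 1e6
```

Running it with 1, 2, 4 and 8 paths on the same array gives a curve of aggregate MB/sec against concurrency, which is closer to what the DB sees than a single-stream number.
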

I'm already past the point where a single thread can keep up with the disk on
a sequential scan. For the simplest select * queries, a single thread tops out
at ~800MB/sec for me.
For queries with more complicated processing/filtering it's much less;
400MB/sec is usually a pretty good rate for a single thread.
However, our raw array does about 1200MB/sec, and we get roughly 75% of that
with between 4 and 8 concurrent sequential scans. It took some
significant tuning and testing time to make sure this worked, and to balance
that with random I/O requirements.
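
To put the figures above side by side, a small back-of-the-envelope calculation in Python. The rates are the ones quoted in this message; the even split across scans and the helper name are simplifying assumptions for illustration, not something I measured per scan.

```python
# Figures quoted in the message; everything else is illustrative.
RAW_ARRAY_MB_S = 1200       # raw sequential rate of the array
EFFICIENCY = 0.75           # fraction reached with 4 to 8 concurrent scans
SINGLE_SIMPLE_MB_S = 800    # one thread, simplest select * query
SINGLE_COMPLEX_MB_S = 400   # one thread, heavier processing/filtering


def per_scan_share(aggregate_mb_s, scans):
    """Assume the aggregate rate is split evenly across concurrent scans."""
    return aggregate_mb_s / scans


aggregate = RAW_ARRAY_MB_S * EFFICIENCY  # 900.0 MB/sec in total

for scans in (4, 8):
    share = per_scan_share(aggregate, scans)
    print(f"{scans} scans: {aggregate:.0f} MB/sec total, {share:.1f} MB/sec each")
```

Under the even-split assumption each scan gets 225 MB/sec with 4 scans and 112.5 MB/sec with 8, both below the ~400MB/sec a single thread can process, so in that regime the array rather than the CPU is the limit.
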

Furthermore, higher sequential rates help your random IOPS when you have
sequential access concurrent with random access. You can tune OS parameters
(readahead in Linux, I/O scheduler types) to bias the system towards either
random IOPS or sequential MB/sec throughput. Faster sequential disk access
means a smaller percentage of time spent on sequential I/O, which leaves more
time for random I/O. It only goes so far, but it does help with mixed
loads.
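
The time-share argument can be sketched with a deliberately crude model in Python. All numbers below are hypothetical, and the model ignores the seek cost of switching between sequential and random work, which is one reason the benefit "only goes so far" in practice.

```python
def random_iops_left(seq_demand_mb_s, seq_rate_mb_s, peak_random_iops):
    """Crude model: the array spends seq_demand/seq_rate of its time on
    sequential reads, and the remaining fraction serves random I/O."""
    busy_fraction = min(1.0, seq_demand_mb_s / seq_rate_mb_s)
    return (1.0 - busy_fraction) * peak_random_iops


# Hypothetical workload: 300 MB/sec of sequential reads must be served,
# on an array that could do 2000 random IOPS if it did nothing else.
slow_array = random_iops_left(300, 600, 2000)    # 50% of time left
fast_array = random_iops_left(300, 1200, 2000)   # 75% of time left
print(slow_array, fast_array)
```

In this model, doubling the sequential rate from 600 to 1200 MB/sec raises the random I/O budget from 1000 to 1500 IOPS for the same sequential demand.
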
Overall, it depends a lot on how important sequential scans are to your use
case.