At 20:21 18/04/2005, Alex Turner wrote:
So I wonder if one could take this stripe size thing further and say that a larger stripe size is more likely to result in requests getting served parallelized across disks, which would lead to increased performance?
Actually, it would be pretty much the opposite. The smaller the stripe size, the more evenly distributed data is, and the more disks can be used to serve requests. If your stripe size is too large, many random accesses within one single file (whose size is smaller than the stripe size/number of disks) may all end up on the same disk, rather than being split across multiple disks (the extreme case being stripe size = total size of all disks, which means concatenation). If all accesses had the same cost (i.e. no seek time, only transfer time), the ideal would be to have a stripe size equal to the number of disks.
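To make the distribution argument concrete, here is a minimal sketch of how a plain striped (RAID-0 style) layout maps byte offsets to disks. The function names, the 4-disk array, the 1 MB file and the 8 KB access spacing are all assumptions chosen for illustration, not anything from a real controller or from vinum.

```python
KB = 1024
MB = 1024 * KB

def disk_for_offset(offset, stripe_size, n_disks):
    """Disk index holding the byte at `offset` in a simple striped layout.

    Assumes stripes are assigned round-robin: stripe 0 on disk 0,
    stripe 1 on disk 1, and so on, wrapping around.
    """
    return (offset // stripe_size) % n_disks

def disks_hit(offsets, stripe_size, n_disks):
    """Set of disks that serve at least one of the given offsets."""
    return {disk_for_offset(o, stripe_size, n_disks) for o in offsets}

# Hypothetical workload: accesses spread every 8 KB over a 1 MB file
# that starts at offset 0 of the volume.
offsets = range(0, 1 * MB, 8 * KB)

small_stripe = disks_hit(offsets, 64 * KB, 4)  # stripe much smaller than the file
large_stripe = disks_hit(offsets, 4 * MB, 4)   # stripe larger than the file

print("64 KB stripe uses disks:", sorted(small_stripe))
print("4 MB stripe uses disks:", sorted(large_stripe))
```

With the 64 KB stripe the accesses are spread over all four disks, while with the 4 MB stripe the whole file sits inside one stripe and every access lands on the same disk.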
But below a certain size, you're going to use multiple disks to serve one single request which would not have taken much more time from a single disk (reading even a large number of consecutive blocks within one cylinder does not take much more time than reading a single block), so you would add unnecessary seeks on a disk that could have served another request in the meantime. You should definitely not go below the filesystem block size or the database block size.
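The opposite effect can be sketched the same way: counting how many disks a single request touches. This assumes the same simple round-robin layout; the function name and the request and stripe sizes below are made up for illustration.

```python
KB = 1024

def disks_touched(offset, length, stripe_size, n_disks):
    """Number of distinct disks involved in one contiguous request.

    Assumes a simple round-robin striped layout, so consecutive
    stripes always live on different disks (up to n_disks of them).
    """
    first_stripe = offset // stripe_size
    last_stripe = (offset + length - 1) // stripe_size
    return min(last_stripe - first_stripe + 1, n_disks)

# A hypothetical 8 KB database block read on a 4-disk array:
tiny = disks_touched(0, 8 * KB, 2 * KB, 4)               # stripe below the block size
aligned = disks_touched(0, 8 * KB, 256 * KB, 4)          # block fits in one stripe
straddling = disks_touched(252 * KB, 8 * KB, 256 * KB, 4)  # block crosses a stripe boundary

print(tiny, aligned, straddling)
```

With a 2 KB stripe the single 8 KB read keeps all four disks busy (four seeks for one block), whereas with a 256 KB stripe it usually needs one disk and only occasionally two, when the block straddles a stripe boundary.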
There is an interesting discussion of the optimal stripe size in the vinum manpage on FreeBSD:
(look for "Performance considerations", towards the end -- note however that some of the calculations are not entirely correct).
Basically it says the optimal stripe size is somewhere between 256KB and 4MB, preferably an odd number, and that some hardware RAID controllers don't like big stripe sizes. YMMV, as always.