Thanks for testing that. I added a note on stripe size to http://wiki.apache.org/cassandra/CassandraHardware.
On Wed, Mar 10, 2010 at 11:03 AM, B. Todd Burruss <bburr...@real.com> wrote:
> with the file sizes we're talking about with cassandra and other database
> products, the stripe size doesn't seem to matter. i suppose there may be a
> modicum of overhead with a small stripe size, but i'm not sure. mine is set
> to 128k, which produced the same results as 16k and 256k.
>
> i will say the number of drives within the RAID 0 setup does seem to matter.
> more you have the more parallelism you can get with a good RAID controller.
>
> Eric Rosenberry wrote:
>>
>> Based on the documentation, it is clear that with Cassandra you want to
>> have one disk for commitlog, and one disk for data.
>>
>> My question is: If you think your workload is going to require more io
>> performance to the data disks than a single disk can handle, how would you
>> recommend effectively utilizing additional disks?
>>
>> It would seem a number of vendors sell 1U boxes with four 3.5 inch disks.
>> If we use one for commitlog, is there a way to have Cassandra itself
>> equally split data across the three remaining disks? Or is this something
>> that needs to be handled by the hardware level, or operating system/file
>> system level?
>>
>> Options include a hardware RAID controller in a RAID 0 stripe (this is
>> more $$$ and for what gain?), or utilizing a volume manager like LVM.
>>
>> Along those same lines, if you do implement some type of striping, what
>> RAID stripe size is recommended? (I think Todd Burruss asked this earlier
>> but I did not see a response)
>>
>> Thanks for any input!
>>
>> -Eric
>
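For anyone wondering why 16k, 128k and 256k could give the same results while the drive count still matters, here is a rough sketch of textbook RAID 0 address mapping. It is only a simplified model, not how any particular controller or LVM works, and the function names raid0_locate and bytes_per_disk are made up for illustration. It ignores seek time, controller cache and request queuing, so it says nothing about actual throughput; it only shows how the bytes of a large sequential read are divided among the disks.

```python
def raid0_locate(offset, stripe_size, num_disks):
    """Map a logical byte offset to (disk index, byte offset on that disk).

    Assumes plain round-robin striping: stripe unit 0 goes to disk 0,
    unit 1 to disk 1, and so on, wrapping around after num_disks units.
    """
    stripe_index = offset // stripe_size
    disk = stripe_index % num_disks
    disk_offset = (stripe_index // num_disks) * stripe_size + offset % stripe_size
    return disk, disk_offset


def bytes_per_disk(read_start, read_len, stripe_size, num_disks):
    """Count how many bytes of one sequential read land on each disk."""
    counts = [0] * num_disks
    pos = read_start
    end = read_start + read_len
    while pos < end:
        disk, _ = raid0_locate(pos, stripe_size, num_disks)
        # Bytes left in the current stripe unit, capped at the end of the read.
        chunk = min(stripe_size - pos % stripe_size, end - pos)
        counts[disk] += chunk
        pos += chunk
    return counts


if __name__ == "__main__":
    read_len = 96 * 1024 * 1024  # a hypothetical 96 MiB sequential read
    for stripe_kib in (16, 128, 256):
        print(stripe_kib, bytes_per_disk(0, read_len, stripe_kib * 1024, 3))
```

In this model a read that is much larger than the stripe unit is split evenly across the disks for all three stripe sizes, so each disk's share depends on how many disks there are rather than on the stripe size. That is at least consistent with what Todd measured, though real hardware has more going on than this.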