http://wiki.apache.org/cassandra/CassandraHardware
On Mon, Apr 26, 2010 at 1:06 PM, Edmond Lau <edm...@ooyala.com> wrote:
> Ryan -
>
> You (or maybe someone else) mentioned using RAID-0 instead of multiple
> data directories at the Cassandra hackathon as well. Could you
> explain the motivation behind that?
>
> Thanks,
> Edmond
>
> On Mon, Apr 26, 2010 at 9:53 AM, Ryan King <r...@twitter.com> wrote:
>> I would recommend using RAID-0 rather than multiple data directories.
>>
>> -ryan
>>
>> 2010/4/26 Roland Hänel <rol...@haenel.me>:
>>> I have a configuration like this:
>>>
>>> <DataFileDirectories>
>>>   <DataFileDirectory>/storage01/cassandra/data</DataFileDirectory>
>>>   <DataFileDirectory>/storage02/cassandra/data</DataFileDirectory>
>>>   <DataFileDirectory>/storage03/cassandra/data</DataFileDirectory>
>>> </DataFileDirectories>
>>>
>>> After loading a big chunk of data into Cassandra, I end up with some 70GB in
>>> the first directory, and only about 10GB in the second and third ones. All
>>> rows are quite small, so it's not just some big rows that contain the
>>> majority of the data.
>>>
>>> Does Cassandra have the ability to 'see' the maximum available space in
>>> these directories? I'm asking myself this question since my limit is 100GB,
>>> and the first directory is approaching this limit...
>>>
>>> And wouldn't it be better if Cassandra tried to 'load-balance' the files
>>> across the directories, since this would result in better (read) performance
>>> if the directories are on different disks (which is the case for me)?
>>>
>>> Any help is appreciated.
>>>
>>> Roland
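
To illustrate the free-space-aware placement Roland is asking about, here is a minimal Python sketch. It is not how Cassandra chooses a data directory; the function names `free_bytes` and `pick_data_directory` are hypothetical, and the only real API used is the standard library's `shutil.disk_usage`. The idea is simply to place the next file on whichever configured directory currently has the most free space.

```python
import shutil
from typing import Callable, Iterable


def free_bytes(path: str) -> int:
    """Return the free space, in bytes, of the filesystem holding `path`."""
    return shutil.disk_usage(path).free


def pick_data_directory(
    directories: Iterable[str],
    free_space: Callable[[str], int] = free_bytes,
) -> str:
    """Pick the directory with the most free space (hypothetical policy).

    `free_space` is injectable so the policy can be tried out without
    touching real disks. Ties go to the directory listed first.
    """
    candidates = list(directories)
    if not candidates:
        raise ValueError("no data directories configured")
    return max(candidates, key=free_space)
```

With the numbers from the thread (a 100GB limit per directory, roughly 70GB used in the first and 10GB in the other two), such a policy would steer new files away from `/storage01`. Note that it balances free space only; it does not address the read-performance argument for RAID-0.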