Re: Effective allocation of multiple disks

2010-03-12 Thread Eric Rosenberry
Ryan- Are you going to use software or hardware based RAID 0? Does anyone on the list have any data to compare the performance of hardware RAID 0 vs. software LVM RAID 0? I would think software RAID 0 would be fine since there is no actual computation being done... Thanks! -Eric On Thu, Mar

Re: Effective allocation of multiple disks

2010-03-12 Thread Ted Zlatanov
On Thu, 11 Mar 2010 12:01:27 -0600 Eric Evans eev...@rackspace.com wrote: EE On Wed, 2010-03-10 at 23:20 -0600, Jonathan Ellis wrote: On Wed, Mar 10, 2010 at 9:31 PM, Anthony Molinaro antho...@alumni.caltech.edu wrote: I would almost recommend just keeping things simple and removing

Re: Effective allocation of multiple disks

2010-03-12 Thread Ryan King
We're going to use software RAID. -ryan On Fri, Mar 12, 2010 at 9:24 AM, Eric Rosenberry epros...@gmail.com wrote: Ryan- Are you going to use software or hardware based RAID 0? Does anyone on the list have any data to compare the performance of hardware RAID 0 vs. software LVM RAID 0? I

Re: Effective allocation of multiple disks

2010-03-11 Thread Eric Evans
On Wed, 2010-03-10 at 23:20 -0600, Jonathan Ellis wrote: On Wed, Mar 10, 2010 at 9:31 PM, Anthony Molinaro antho...@alumni.caltech.edu wrote: I would almost recommend just keeping things simple and removing multiple data directories from the config altogether and just documenting that you

Re: Effective allocation of multiple disks

2010-03-11 Thread Jonathan Ellis
Except that for a major compaction the whole thing gets put in one directory. That's the problem w/ the JBOD approach. On Thu, Mar 11, 2010 at 12:01 PM, Eric Evans eev...@rackspace.com wrote: On Wed, 2010-03-10 at 23:20 -0600, Jonathan Ellis wrote: On Wed, Mar 10, 2010 at 9:31 PM, Anthony

Re: Effective allocation of multiple disks

2010-03-11 Thread Anthony Molinaro
I'm still wondering what happens when you have something like 2 500GB disks, with 2 sstables which use up 250GB, one on each disk, then a major compaction occurs. Will it still compact and probably fill up a disk (especially with the 2x overhead of compaction mentioned either here or on the
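
The scenario in this message can be checked with some quick arithmetic. The sketch below is only an illustration, not Cassandra code: the function name and parameters are hypothetical, it assumes each of the two sstables is 250GB, and it assumes the worst case where compaction discards nothing, so the merged output is as large as the inputs combined. It also relies on the point made elsewhere in the thread that a major compaction writes its output to a single directory while the input files still occupy space.

```python
def major_compaction_fits(sstable_sizes_gb, disk_capacity_gb, used_on_target_gb):
    """Hypothetical helper: can the merged output fit on the target disk?

    Assumes the worst case, where no rows are dropped during compaction,
    so the output is as large as all input sstables combined. The inputs
    are not deleted until the compaction finishes, so space already used
    on the target disk is not available to the output.
    """
    merged_output_gb = sum(sstable_sizes_gb)
    free_on_target_gb = disk_capacity_gb - used_on_target_gb
    return merged_output_gb <= free_on_target_gb


# Two 500GB disks, one 250GB sstable on each (assumed sizes).
# The target disk already holds its own 250GB sstable.
fits = major_compaction_fits(
    sstable_sizes_gb=[250, 250],
    disk_capacity_gb=500,
    used_on_target_gb=250,
)
print("major compaction fits on one disk:", fits)
```

Under these assumptions the merged output needs up to 500GB but only 250GB is free on the target disk, so the compaction would not fit in the worst case.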

Re: Effective allocation of multiple disks

2010-03-11 Thread Ryan King
On Thu, Mar 11, 2010 at 10:45 AM, Jonathan Ellis jbel...@gmail.com wrote: Except that for a major compaction the whole thing gets put in one directory.  That's the problem w/ the JBOD approach. Even without major compaction, you can get significant imbalances in how much data is on each disk

RE: Effective allocation of multiple disks

2010-03-10 Thread Stu Hood
You can list multiple DataFileDirectories, and Cassandra will scatter files across all of them. Use 1 disk for the commitlog, and 3 disks for data directories. See http://wiki.apache.org/cassandra/CassandraHardware#Disk Thanks, Stu -----Original Message----- From: Eric Rosenberry

Re: Effective allocation of multiple disks

2010-03-10 Thread Eric Rosenberry
Ahh, thanks! I had read that, but I had assumed the reference to use one or more devices for DataFileDirectories was referring to somehow making multiple physical devices into one logical device via some underlying RAID system. So then as far as free space on the disks goes, I have seen references

Re: Effective allocation of multiple disks

2010-03-10 Thread Jonathan Ellis
Thanks for testing that, added a note to http://wiki.apache.org/cassandra/CassandraHardware on stripe size. On Wed, Mar 10, 2010 at 11:03 AM, B. Todd Burruss bburr...@real.com wrote: with the file sizes we're talking about with cassandra and other database products, the stripe size doesn't seem

Re: Effective allocation of multiple disks

2010-03-10 Thread Anthony Molinaro
This is incorrect, as discussed a few weeks ago. I have a setup with multiple disks, and as soon as compaction occurs all the data ends up on one disk. If you need the additional io, you will want raid0. But simply listing multiple DataFileDirectories will not work. -Anthony On Wed, Mar 10,

Re: Effective allocation of multiple disks

2010-03-10 Thread Stu Hood
- From: Anthony Molinaro antho...@alumni.caltech.edu Sent: Wednesday, March 10, 2010 3:38pm To: cassandra-user@incubator.apache.org Subject: Re: Effective allocation of multiple disks This is incorrect, as discussed a few weeks ago. I have a setup with multiple disks, and as soon as compaction occurs

Re: Effective allocation of multiple disks

2010-03-10 Thread Anthony Molinaro
thoroughly, and I have some ideas there. -----Original Message----- From: Anthony Molinaro antho...@alumni.caltech.edu Sent: Wednesday, March 10, 2010 3:38pm To: cassandra-user@incubator.apache.org Subject: Re: Effective allocation of multiple disks This is incorrect, as discussed a few

Re: Effective allocation of multiple disks

2010-03-10 Thread Jonathan Ellis
On Wed, Mar 10, 2010 at 9:31 PM, Anthony Molinaro antho...@alumni.caltech.edu wrote: I would almost recommend just keeping things simple and removing multiple data directories from the config altogether and just documenting that you should plan on using OS level mechanisms for growing