This is excellent, specific feedback. Thanks! Given the relative costs, I was hoping L was the optimal tradeoff vs XL, but if that's the best option, that's the best option.
will

On Wed, Mar 9, 2011 at 12:04 PM, Erik Onnen <eon...@gmail.com> wrote:
> I'd recommend not storing commit logs or data files on EBS volumes if
> your machines are under any decent amount of load. I say that for
> three reasons.
>
> First, both EBS volumes contend directly for network throughput with
> what appears to be a peer QoS policy to standard packets. In other
> words, if you're saturating a network link, EBS throughput falls. The
> same has not been true of ephemeral volumes in all of our testing;
> ephemeral I/O speeds tend to only take a minor hit under network
> pressure and are consistently faster in raw speed tests.
>
> Second, at some point it's a given that you will encounter misbehaving
> EBS volumes. They won't completely fail; worse, they will just get
> really, really slow. Oftentimes this is worse than a total failure
> because the system just back piles reads/writes but doesn't totally
> fall over until the entire cluster becomes overwhelmed. We've never
> had single volume ephemeral problems.
>
> Lastly, I think people have a tendency to bolt a large number of EBS
> volumes to a host and think that because they have disk capacity they
> serve more data from fewer hosts. If you push that too far, you'll
> outstrip the ability of the system to keep effective buffer caches and
> concurrently serve requests for all the data it is responsible for
> managing. IME there is pretty good parity between an EC2 XL and the
> ephemeral disks available relative to how Cassandra uses disk and RAM
> that adding more storage is right at the breaking point of over
> committing your hardware.
>
> If you want protection from AZ failure, split your ring across AZs
> (Cassandra is quite good at this) or copy snapshots to EBS volumes.
>
> -erik
>
> There are a lot of benefits to EBS volumes; I/O throughput and
> reliability are not among those benefits.
>
> On Wed, Mar 9, 2011 at 8:39 AM, William Oberman
> <ober...@civicscience.com> wrote:
> > I thought nodetool snapshot writes the snapshot locally, requiring 2x of
> > expensive storage allocation 24x7 (vs. cheap storage allocation of a ebs
> > snapshot). By that I mean EBS allocation is GB allocated per month costs at
> > one rate, and EBS snapshots are delta compressed copies to S3.
> >
> > Can you point the snapshot to an external filesystem?
> >
> > will
> >
> > On Wed, Mar 9, 2011 at 11:31 AM, Sasha Dolgy <sdo...@gmail.com> wrote:
> >>
> >> Could you not nodetool snapshot the data into an mounted ebs/s3 bucket and
> >> satisfy your development requirement?
> >> -sd
> >>
> >> On Wed, Mar 9, 2011 at 5:23 PM, William Oberman <ober...@civicscience.com>
> >> wrote:
> >>>
> >>> For me, to transition production data into a development environment for
> >>> real world testing. Also, backups are never a bad idea, though I agree most
> >>> all risk is mitigated due to cassandra's design.
> >>>
> >>> will
> >
> >
> >
> > --
> > Will Oberman
> > Civic Science, Inc.
> > 3030 Penn Avenue., First Floor
> > Pittsburgh, PA 15201
> > (M) 412-480-7835
> > (E) ober...@civicscience.com
> >
>

--
Will Oberman
Civic Science, Inc.
3030 Penn Avenue., First Floor
Pittsburgh, PA 15201
(M) 412-480-7835
(E) ober...@civicscience.com
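
For anyone following up on Erik's suggestion to copy snapshots to EBS volumes, here is a rough sketch of what that copy step might look like. It is only an illustration, not something anyone in this thread posted. The function name `copy_snapshots` is made up, and it assumes a snapshot has already been taken and sits in a `snapshots/<tag>` directory somewhere below the Cassandra data directory (the exact layout depends on the Cassandra version). It also assumes the EBS volume is already mounted as an ordinary filesystem.

```python
import shutil
from pathlib import Path


def copy_snapshots(data_dir, backup_dir, tag):
    """Copy every snapshots/<tag> directory under data_dir into backup_dir.

    The path relative to data_dir is preserved under backup_dir.
    Returns the list of destination directories that were written.
    (Hypothetical helper, not part of Cassandra or nodetool.)
    """
    data_dir = Path(data_dir)
    backup_dir = Path(backup_dir)
    copied = []
    # Assumption: each snapshot lives at <something>/snapshots/<tag> below
    # the data directory; the depth is not assumed, so search recursively.
    for snapshots_dir in sorted(data_dir.rglob("snapshots")):
        snap = snapshots_dir / tag
        if not snap.is_dir():
            continue
        dest = backup_dir / snap.relative_to(data_dir)
        # copytree creates missing parent directories and raises
        # FileExistsError if dest already exists, so use a fresh tag
        # (or a fresh backup_dir) for each run.
        shutil.copytree(snap, dest)
        copied.append(dest)
    return copied
```

Usage would be something like `copy_snapshots("/var/lib/cassandra/data", "/mnt/ebs-backup", "mytag")`, where both paths and the tag are placeholders for your own setup. Once the copy finishes, the local snapshot can be removed and the backup volume snapshotted on the EBS side, so the extra space is only needed on the cheaper volume.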