Re: Aamzon EC2 & Cassandra to ebs or not..

2011-03-09 Thread Jonathan Ellis
Right, local snapshot is no-cost both from an EC2 pricing standpoint and from a disk usage standpoint (because it uses hard links). On Wed, Mar 9, 2011 at 10:48 AM, Sasha Dolgy wrote: > Hi Will, > http://wiki.apache.org/cassandra/Operations#Backing_up_data > If the snapshot is written to the ephe
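To illustrate the hard-link point above, here is a minimal sketch, not Cassandra's actual snapshot code. It assumes a POSIX filesystem where `os.link` is supported; the file name `example-Data.db` and the helper `hard_link_snapshot` are hypothetical, chosen only for the demo. A hard-linked "snapshot" shares its inode with the original file, so it consumes no extra data blocks when it is taken.

```python
import os
import tempfile


def hard_link_snapshot(src_path, snapshot_dir):
    """Hard-link src_path into snapshot_dir and return the new path.

    Hypothetical helper for illustration only; it is not part of Cassandra.
    """
    os.makedirs(snapshot_dir, exist_ok=True)
    dst_path = os.path.join(snapshot_dir, os.path.basename(src_path))
    os.link(src_path, dst_path)  # new directory entry, same inode, no data copy
    return dst_path


def demo():
    """Return (same_inode, link_count) for a file and its hard-linked copy."""
    with tempfile.TemporaryDirectory() as data_dir:
        # Stand-in for an on-disk data file (the name is an assumption).
        data_file = os.path.join(data_dir, "example-Data.db")
        with open(data_file, "wb") as f:
            f.write(b"x" * 4096)

        linked = hard_link_snapshot(
            data_file, os.path.join(data_dir, "snapshots", "demo")
        )
        original, snapshot = os.stat(data_file), os.stat(linked)
        return original.st_ino == snapshot.st_ino, original.st_nlink


if __name__ == "__main__":
    print(demo())
```

One consequence of hard-link semantics: the blocks are only freed once every link is removed, so a snapshot starts to occupy space of its own after the original file is deleted.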

Re: Aamzon EC2 & Cassandra to ebs or not..

2011-03-09 Thread William Oberman
Based on Eric's email, it sounds like EBS is a no go from the start. But given your snapshot feedback, it seems like you have to plan on leaving slack on every disk, and the % of slack depends on the size of a snapshot relative to the data (given the snapshot shares the disk with the data, at leas

Re: Aamzon EC2 & Cassandra to ebs or not..

2011-03-09 Thread William Oberman
This is excellent, specific feedback. Thanks! Given the relative costs, I was hoping L was the optimal tradeoff vs XL, but if that's the best option, that's the best option. will On Wed, Mar 9, 2011 at 12:04 PM, Erik Onnen wrote: > I'd recommend not storing commit logs or data files on EBS vo

Re: Aamzon EC2 & Cassandra to ebs or not..

2011-03-09 Thread Erik Onnen
I'd recommend not storing commit logs or data files on EBS volumes if your machines are under any decent amount of load. I say that for three reasons. First, both EBS volumes contend directly for network throughput with what appears to be a peer QoS policy to standard packets. In other words, if y

Re: Aamzon EC2 & Cassandra to ebs or not..

2011-03-09 Thread Sasha Dolgy
Hi will, Quickly did a snapshot: nodetool -h 10.0.0.2 -p 8080 snapshot 09032011 The snapshots end up in the data dir for cassandra. The default is /var/lib/cassandra/data//snapshots/ In this directory i have: 1299689801925-09032011 -sd On Wed, Mar 9, 2011 at 5:54 PM, William Oberman wrote:
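A small sketch for reading the snapshot directory name shown above. It assumes the name has the form `<epoch milliseconds>-<tag>`, which is inferred from the single example in this message and is not a documented format; `parse_snapshot_dir` is a hypothetical helper.

```python
from datetime import datetime, timezone


def parse_snapshot_dir(name):
    """Split '<millis>-<tag>' into (UTC datetime, tag).

    Assumes the numeric prefix is milliseconds since the Unix epoch.
    """
    millis, _, tag = name.partition("-")
    taken_at = datetime.fromtimestamp(int(millis) / 1000, tz=timezone.utc)
    return taken_at, tag


if __name__ == "__main__":
    taken_at, tag = parse_snapshot_dir("1299689801925-09032011")
    print(taken_at.isoformat(), tag)
```

Under that assumption, the prefix of the example directory falls on 2011-03-09 (UTC), which matches the date of this message, and the suffix is the tag passed to `nodetool snapshot`.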

Re: Aamzon EC2 & Cassandra to ebs or not..

2011-03-09 Thread Frank LoVecchio
> > Now that I'm past the problems of IP addresses changing ... I am onto the > idea of storage. Initially I had thought that for each cassandra instance, I > should have an EBS volume to store all the cassandra data / information. > Now I'm starting to wonder if this is duplication and not necessa

Re: Aamzon EC2 & Cassandra to ebs or not..

2011-03-09 Thread Dave Viner
Sasha, You might also check out http://coreyhulen.org/category/cassandra/ for speed tests done by Corey Hulen on different disk configurations (both inside ec2 and on real hw). If you write to the ephemeral storage on an EC2 instance, there is no additional cost for the data written. Mostly sim

Re: Aamzon EC2 & Cassandra to ebs or not..

2011-03-09 Thread William Oberman
I haven't done backups yet, so I don't know where the data is written. Is it where the nodetool is run from? Or local to the instance running cassandra (and there, local to the data directory?). I assumed it was the latter (not finding docs on that yet), and that would require 2x storage allocat

Re: Aamzon EC2 & Cassandra to ebs or not..

2011-03-09 Thread Sasha Dolgy
Hi Will, http://wiki.apache.org/cassandra/Operations#Backing_up_data If the snapshot is written to the ephemeral storage ... there isn't a cost. (i need to confirm that) You can then move this to an S3 bucket with RDS if you want or fu

Re: Aamzon EC2 & Cassandra to ebs or not..

2011-03-09 Thread William Oberman
I thought nodetool snapshot writes the snapshot locally, requiring 2x of expensive storage allocation 24x7 (vs. cheap storage allocation of an EBS snapshot). By that I mean EBS allocation is billed at one rate per GB allocated per month, and EBS snapshots are delta compressed copies to S3. Can you poin

Re: Aamzon EC2 & Cassandra to ebs or not..

2011-03-09 Thread Sasha Dolgy
Could you not nodetool snapshot the data into a mounted ebs/s3 bucket and satisfy your development requirement? -sd On Wed, Mar 9, 2011 at 5:23 PM, William Oberman wrote: > For me, to transition production data into a development environment for > real world testing. Also, backups are never a b

Re: Aamzon EC2 & Cassandra to ebs or not..

2011-03-09 Thread Jeremy Hanna
I've seen both sides but Cassandra does handle replication and bringing data back is a matter of bootstrapping a node to replace the downed node. One thing to consider is availability zones and regions though. What happens if your entire cluster goes down in the case of a single datacenter go

Re: Aamzon EC2 & Cassandra to ebs or not..

2011-03-09 Thread William Oberman
For me, to transition production data into a development environment for real world testing. Also, backups are never a bad idea, though I agree most all risk is mitigated due to cassandra's design. will On Wed, Mar 9, 2011 at 10:57 AM, Sasha Dolgy wrote: > > well, this is what i'm getting at.

Re: Aamzon EC2 & Cassandra to ebs or not..

2011-03-09 Thread Sasha Dolgy
well, this is what i'm getting at. why would you want to back it up if the cluster is working properly? backup is silly ; ) On Wed, Mar 9, 2011 at 4:54 PM, William Oberman wrote: > I'm considering similar issues right now. The problem with ephemeral > storage is I don't know an easy way to

Re: Aamzon EC2 & Cassandra to ebs or not..

2011-03-09 Thread William Oberman
I'm considering similar issues right now. The problem with ephemeral storage is I don't know an easy way to back it up, while on an EBS it's a simple snapshot API call. Otherwise, I believe the performance of the ephemeral (certainly in the case of large or greater, where you can RAID0 multiple d