RE: best way to backup

2011-04-30 Thread Jeremiah Jordan
The files inside the keyspace folders are the SSTables. From: aaron morton [mailto:aa...@thelastpickle.com] Sent: Friday, April 29, 2011 4:49 PM To: user@cassandra.apache.org Subject: Re: best way to backup William, Some info on the sstables from me http

Re: best way to backup

2011-04-30 Thread William Oberman
, Jeremiah Jordan jeremiah.jor...@morningstar.com wrote: The files inside the keyspace folders are the SSTable. -- *From:* aaron morton [mailto:aa...@thelastpickle.com] *Sent:* Friday, April 29, 2011 4:49 PM *To:* user@cassandra.apache.org *Subject:* Re: best way

Re: best way to backup

2011-04-29 Thread Daniel Doubleday
What we are about to set up is a Time Machine-like backup. This is more like an add-on to the S3 backup. Our boxes have an additional larger drive for local backup. We create a new backup snapshot every x hours which hardlinks the files in the previous snapshot (bit like Cassandra's
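A minimal sketch of the hardlink idea described in this message, not the poster's actual script. The function name `make_snapshot` and its parameters are hypothetical; it assumes the snapshots live on the same filesystem (hardlinks cannot cross filesystems) and compares file contents to decide whether a file is unchanged.

```python
import filecmp
import os
import shutil


def make_snapshot(source_dir, previous_snapshot, new_snapshot):
    """Copy source_dir into new_snapshot, hardlinking files that are
    byte-identical to the copy already held in previous_snapshot."""
    for root, _dirs, files in os.walk(source_dir):
        rel = os.path.relpath(root, source_dir)
        dst_dir = os.path.normpath(os.path.join(new_snapshot, rel))
        os.makedirs(dst_dir, exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.join(dst_dir, name)
            old = None
            if previous_snapshot:
                old = os.path.normpath(os.path.join(previous_snapshot, rel, name))
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                # Unchanged since the last snapshot: share the same disk blocks.
                os.link(old, dst)
            else:
                # New or changed file: store a real copy.
                shutil.copy2(src, dst)
```

Pass `None` as `previous_snapshot` for the first run. Because SSTables are not modified once written, most files in each later snapshot would end up as links rather than copies.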

Re: best way to backup

2011-04-29 Thread William Oberman
Dumb question, but referenced twice now: which files are the SSTables, and why is backing them up incrementally a win? Or should I not bother to understand internals, and instead just roll with the "back up my keyspace(s) and system in a compressed tar" strategy, as while it may be excessive, it's

Re: best way to backup

2011-04-29 Thread Jeremy Hanna
Good point - we plan to do regular testing to restore the cluster. Also we might spin up a snapshot of the cluster for testing as well. Also I wonder how much time compression will save when it comes to restores. I'll have to run some tests on that. Thanks for posting. Jeremy On Apr 28,

Re: best way to backup

2011-04-29 Thread aaron morton
William, Some info on the sstables from me: http://thelastpickle.com/2011/04/28/Forces-of-Write-and-Read/ If you want to know more, check out the BigTable and original Facebook papers, linked from the wiki. Aaron On 29 Apr 2011, at 23:43, William Oberman wrote: Dumb question,

best way to backup

2011-04-28 Thread William Oberman
Even with N nodes for redundancy, I still want to have backups. I'm an Amazon person, so naturally I'm thinking S3. Reading over the docs, and messing with nodetool, it looks like each new snapshot contains the previous snapshot as a subset (and I've read how Cassandra uses hard links to avoid

Re: best way to backup

2011-04-28 Thread Sasha Dolgy
You could take a snapshot to an EBS volume, then take a snapshot of that via AWS. Of course, this is OK when they -aren't- having outages and issues ... On Apr 28, 2011 9:54 PM, William Oberman ober...@civicscience.com wrote: Even with N-nodes for redundancy, I still want to have backups. I'm

Re: best way to backup

2011-04-28 Thread William Oberman
Interesting. Both use cases seem easy to code. Compress to S3 = Cassandra snapshot, tar, S3 put. EBS = Cassandra snapshot, rsync snapshot dir -> EBS, EBS snapshot. I think the former is cheaper in terms of costs, as my gut says keeping around an EBS drive is more money than the lack of deltas in

Re: best way to backup

2011-04-28 Thread Jeremy Hanna
One thing we're looking at doing is watching the Cassandra data directory and backing up the sstables to S3 when they are created. Some guys at SimpleGeo started tablesnap, which does this: https://github.com/simplegeo/tablesnap What it does is for every sstable that is pushed to S3, it also
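The "watch the data directory" idea can be sketched with simple polling. This is an illustration only and is not how tablesnap is implemented; the function name `new_sstables` and the `-Data.db` suffix filter are assumptions made for the example, and the upload step is left out.

```python
import os


def new_sstables(data_dir, seen, suffix="-Data.db"):
    """Return sorted paths under data_dir that end in suffix and were not
    reported before. `seen` is a set that is updated in place."""
    found = []
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            path = os.path.join(root, name)
            if name.endswith(suffix) and path not in seen:
                seen.add(path)
                found.append(path)
    return sorted(found)
```

Calling this in a loop with the same `seen` set yields each file once. A real tool would also need to confirm that a file is completely written before uploading it.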

Re: best way to backup

2011-04-28 Thread William Oberman
My newbie mistake (always good to test things): my script wasn't storing/restoring system, only my keyspace. So, if you want to be able to restore from backup, make sure you save the keyspace and system! will On Thu, Apr 28, 2011 at 4:35 PM, Jeremy Hanna jeremy.hanna1...@gmail.com wrote: one
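A small sketch of the lesson in this message: always include the system keyspace in the archive. This is not the poster's script; `archive_keyspaces` and its parameters are hypothetical, and it assumes each keyspace is a directory directly under the data directory. Uploading the resulting file to S3 is left out.

```python
import os
import tarfile


def archive_keyspaces(data_dir, keyspaces, archive_path):
    """Write a gzip-compressed tar of the named keyspace directories,
    always adding 'system'. Returns the list of archived names."""
    names = sorted(set(keyspaces) | {"system"})
    with tarfile.open(archive_path, "w:gz") as tar:
        for name in names:
            path = os.path.join(data_dir, name)
            if not os.path.isdir(path):
                # Fail loudly instead of producing an incomplete backup.
                raise FileNotFoundError(path)
            tar.add(path, arcname=name)
    return names
```

Raising on a missing directory is a deliberate choice here: a backup that silently skips a keyspace is the kind of mistake that only shows up at restore time.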

Re: best way to backup

2011-04-28 Thread Adrian Cockcroft
Netflix has also gone down this path: we run a regular full backup to S3 of a compressed tar, and we have scripts that restore everything into the right place on a different cluster (it needs the same node count). We also pick up the SSTables as they are created, and drop them in S3. Whatever you