If you lose the RF nodes that together hold every replica of some data,
that data is gone, so it is a good idea to have a recent backup then.
Another situation is when you deploy a bug in the software and start
writing crap data to Cassandra. Replication does not help there, and
depending on the situation you need to restore from the backup.
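
For the approach described in the quoted message below (run nodetool
snapshot on each node, then push the snapshot files to S3), here is a
rough per-node sketch in Python. It is only an illustration under some
assumptions: the `aws` command line client is installed and configured,
the data directory uses the usual <data_dir>/<keyspace>/<table>/snapshots/<tag>
layout, and the bucket name, node name and S3 key layout are hypothetical
names I made up. It is not how tablesnap works.

```python
import subprocess
from pathlib import Path


def snapshot_cmd(tag, keyspace):
    """Build the nodetool command that snapshots one keyspace on this node."""
    return ["nodetool", "snapshot", "-t", tag, keyspace]


def find_snapshot_dirs(data_dir, keyspace, tag):
    """Return every <data_dir>/<keyspace>/<table>/snapshots/<tag> directory."""
    return sorted(Path(data_dir, keyspace).glob(f"*/snapshots/{tag}"))


def upload_cmd(snapshot_dir, bucket, node, keyspace, tag):
    """Build an `aws s3 sync` command for one table's snapshot directory.

    The destination key layout (node/keyspace/table/tag) is an assumption
    chosen for illustration, not a convention required by Cassandra or S3.
    """
    table = Path(snapshot_dir).parents[1].name
    dest = f"s3://{bucket}/{node}/{keyspace}/{table}/{tag}/"
    return ["aws", "s3", "sync", str(snapshot_dir), dest]


def backup_node(data_dir, keyspace, tag, bucket, node,
                run=subprocess.check_call):
    """Snapshot one keyspace, then upload each table's snapshot directory.

    `run` is injectable so the commands can be inspected without executing.
    """
    run(snapshot_cmd(tag, keyspace))
    for snap_dir in find_snapshot_dirs(data_dir, keyspace, tag):
        run(upload_cmd(snap_dir, bucket, node, keyspace, tag))
```

You would call something like backup_node("/var/lib/cassandra/data",
"my_keyspace", "2013-12-07", "my-backup-bucket", "node1") on every node,
using the same tag everywhere. As Rob says below, the result is only as
consistent as the cluster was when each snapshot was taken.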


2013/12/7 Jason Wee <peich...@gmail.com>

> Hmm... Cassandra's fundamental key features are fault tolerance,
> durability and replication. Just out of curiosity, why would you want
> to do backups?
>
> /Jason
>
>
> On Sat, Dec 7, 2013 at 3:31 AM, Robert Coli <rc...@eventbrite.com> wrote:
>
>> On Fri, Dec 6, 2013 at 6:41 AM, Amalrik Maia <amal...@s1mbi0se.com.br> wrote:
>>
>>> Hey guys, I'm trying to take backups of a multi-node Cassandra cluster
>>> and save them on S3.
>>> My idea is simply to ssh to each server, use nodetool to create
>>> the snapshots, then push them to S3.
>>>
>>
>> https://github.com/synack/tablesnap
>>
>>> So is this approach recommended? My concerns are about inconsistencies
>>> that this approach can lead to, since the snapshots are taken one by one
>>> and not in parallel.
>>> Should I worry about it, or does Cassandra find a way to deal with
>>> inconsistencies when doing a restore?
>>>
>>
>> The backup is as consistent as your cluster is at any given moment, which
>> is "not necessarily". Manual repair brings you closer to consistency, but
>> only on data present when the repair started.
>>
>> =Rob
>>
>
>