I don't really see "nodeprobe snapshot" and "mv snapshotdir/* livedir" as all that much harder, but maybe that's just me.
for a cluster, just add dsh.

-Jonathan

On Tue, Oct 6, 2009 at 3:42 PM, Joe Van Dyk <[email protected]> wrote:
> Sure, not as easy as a "pg_dump db > dump.sql" and "psql db < dump.sql"
> though. Oh well.
>
> On Tue, Oct 6, 2009 at 11:28 AM, Edmond Lau <[email protected]> wrote:
>> Thanks for the replies, guys. It sounds like restoration via snapshots
>> + some application-side logic to sanity check/repair any data around
>> the snapshot time is the way to go.
>>
>> Edmond
>>
>> On Mon, Oct 5, 2009 at 10:15 AM, Jonathan Ellis <[email protected]> wrote:
>>> On Mon, Oct 5, 2009 at 11:23 AM, Thorsten von Eicken <[email protected]> wrote:
>>>> Isn't the question about how you back up a cassandra cluster, not a
>>>> single node?
>>>
>>> Sure, but the generalization is straightforward. :)
>>>
>>>> Can you snapshot the various nodes at different times or do
>>>> they need to be synchronized?
>>>
>>> The closer the synchronization, the more consistent they will be.
>>> (Since Cassandra is designed around eventual consistency, there's some
>>> flexibility here. Conversely, there's no way to tell the system
>>> "don't accept any more writes until the snapshot is done.")
>>>
>>>> Is there a minimal set of nodes that are
>>>> sufficient to back up?
>>>
>>> Assuming your replication is 100% up to date, backing up every Nth node,
>>> where N is the replication factor, could be adequate in theory, but I
>>> wouldn't recommend trying to be clever like that: if you "restored"
>>> from a backup like that, your system would be in a degraded state and
>>> vulnerable to any of the restored nodes failing.
>>>
>>> -Jonathan
>>
>
> --
> Joe Van Dyk
> http://fixieconsulting.com
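[For reference, a rough sketch of the snapshot-and-restore flow described
above. The dsh group name, keyspace, and data-directory paths are assumptions
and will vary by installation and Cassandra release; adjust to match your
setup.]

    # take a snapshot on every node at roughly the same time
    # ("cassandra" is an assumed dsh group listing the cluster's hosts;
    # exact nodeprobe flags vary by release)
    dsh -g cassandra -c "nodeprobe -host localhost snapshot"

    # to restore a node: stop Cassandra, move the snapshot's SSTable files
    # back into the live data directory, then start Cassandra again
    # (paths below are illustrative, not the definitive layout)
    mv /var/lib/cassandra/data/Keyspace1/snapshots/<snapshot-name>/* \
       /var/lib/cassandra/data/Keyspace1/

As noted in the thread, restoring this way only brings back data as of the
snapshot, so any writes accepted after it may need application-side
reconciliation.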
