Periodic snapshots plus incremental backups are, I think, a pretty good way
to get point-in-time restores, but you have to clean up the snapshots and
incremental backups yourself. I believe tablesnap
(https://github.com/JeremyGrosser/tablesnap) is a pretty decent approach for
keeping each node's sstables synced to a location off the host (S3, in
fact). I'm not sure how portable it is to other object storage services,
however. S3 with a lifecycle policy that transitions to Glacier would likely
be the most cost-effective option for long-term retention.
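
To make the cleanup point concrete, here is a minimal sketch of pruning old
incremental backup files on a node. It is not part of tablesnap or Cassandra.
The function names, the retention window, and the assumed directory layout
(<data_dir>/<keyspace>/<table>/backups/) are illustrative assumptions, and it
presumes the files have already been copied off the host before they are
deleted.

```python
import time
from pathlib import Path


def find_stale_backups(data_dir, max_age_days, now=None):
    """Return incremental-backup files older than max_age_days, oldest first."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    stale = []
    # Assumed layout: <data_dir>/<keyspace>/<table>/backups/<sstable files>.
    # Only files directly inside a backups/ directory are considered, so
    # live sstables and snapshots are never touched.
    for path in Path(data_dir).glob("*/*/backups/*"):
        if path.is_file() and path.stat().st_mtime < cutoff:
            stale.append(path)
    return sorted(stale, key=lambda p: p.stat().st_mtime)


def prune(data_dir, max_age_days, dry_run=True):
    """Delete stale backup files; with dry_run=True only report them."""
    stale = find_stale_backups(data_dir, max_age_days)
    for path in stale:
        if not dry_run:
            path.unlink()
    return stale
```

Something like prune("/var/lib/cassandra/data", 7) would list what a 7-day
retention window would remove (the path is only an example); pass
dry_run=False once the list looks right. Snapshots can be cleared separately
with nodetool clearsnapshot.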

On Thu, Jun 16, 2016 at 4:30 PM, Bhuvan Rawal <bhu1ra...@gmail.com> wrote:

> Also, if we are talking about a backup strategy for Cassandra data, there
> are essentially a couple of strategies in use:
>
> 1. Incremental backups: Cassandra hard-links each newly flushed sstable
> into a backups directory, and these files can be shipped to a storage
> location such as AWS Glacier.
> 2. Snapshotting: memtables are flushed so the latest data is captured in
> sstables, then hard links to the sstables are created in a snapshots
> directory. This is a very fast process. A snapshot on its own only lets
> you restore to the moment it was taken, whereas incremental backups taken
> since then let you restore to later points.
>
> Depending on the use case, you can use 1 or 2 or both.
>
> On Fri, Jun 17, 2016 at 4:46 AM, Bhuvan Rawal <bhu1ra...@gmail.com> wrote:
>
>> What kind of data are we talking about here?
>> Is it time series data that is mostly inserts with infrequent updates, or
>> frequently updated data? How frequently is old data read? I ask
>> because your node size planning and compaction strategy will essentially
>> depend on these.
>>
>> I have known people to go up to 3-5 TB per node if the data is not
>> updated frequently.
>>
>> Regards,
>> Bhuvan
>>
>> On Fri, Jun 17, 2016 at 4:31 AM, <vasu.no...@gmail.com> wrote:
>>
>>> Bhuvan,
>>>
>>> Thanks for the info, but I'm actually not looking for a migration
>>> strategy. I just want best practices for backup strategy and retention
>>> policy.
>>>
>>> Thanks,
>>> Vasu
>>>
>>> On Jun 16, 2016, at 6:51 PM, Bhuvan Rawal <bhu1ra...@gmail.com> wrote:
>>>
>>> Hi Vasu,
>>>
>>> Planet Cassandra has a documentation page with basic info about migrating
>>> to Cassandra from MySQL: what to expect and what not to. It can be found
>>> here <http://planetcassandra.org/mysql-to-cassandra-migration/>.
>>>
>>> I had a look at this slide deck
>>> <http://www.slideshare.net/planetcassandra/migration-best-practices-from-rdbms-to-cassandra-without-a-hitch>
>>> a while back. It provides a pretty reliable 4-phase sync strategy,
>>> starting from slide 31. The Q&A session of the talk is informative too:
>>> http://www.doanduyhai.com/blog/?p=1757.
>>>
>>> Best Regards,
>>> Bhuvan
>>>
>>> On Fri, Jun 17, 2016 at 4:03 AM, <vasu.no...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm from the relational world and recently started working with
>>>> Cassandra. I'm just wondering what the backup best practices are for a
>>>> DB of around 100 TB with a multi-DC setup.
>>>>
>>>>
>>>> Thanks,
>>>> Vasu
>>>
>>>
>>>
>>
>
