Hi Daniel,

Yes, you are right: rsyncing just the snapshots does require some
additional work.

How about something like this to simplify the rsync syntax for the
backup?

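# run this from the Cassandra data directory
backup_dir="<YOUR_BACKUP_DIR>"  # replace with your backup destination

# iterate through each table's snapshot directory (<keyspace>/<table>/snapshots)
for snap_dir in $(find . -type d -name snapshots)
do
  # get <keyspace>/<table>, i.e. the path without the 'snapshots' component
  out_tbl=$(echo "${snap_dir}" | cut -d'/' -f2,3)

  # iterate through each snapshot taken of that table
  for snap_path in "${snap_dir}"/*/
  do
    # skip tables that have no snapshots (the glob stays unexpanded)
    [ -d "${snap_path}" ] || continue
    snap=$(basename "${snap_path}")

    # make backup directory and perform the rsync
    mkdir -p "${backup_dir}/${out_tbl}/${snap}"
    rsync -azP "${snap_path}" "${backup_dir}/${out_tbl}/${snap}"
  done
done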
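
If you would rather avoid the nested loops, another option is to let find
build the list of snapshot files and hand that list to rsync via
--files-from. This is only a sketch: it assumes the usual
<keyspace>/<table>/snapshots/<snapshot_name> layout, and the function names
snapshot_files and sync_snapshots are made up for illustration.

```shell
# List every file that belongs to a snapshot, relative to the data directory.
# Assumes the layout <data_dir>/<keyspace>/<table>/snapshots/<snapshot_name>/.
# The optional second argument restricts the list to one snapshot name.
snapshot_files() {
  data_dir=$1
  snapshot_name=${2:-*}
  (cd "$data_dir" && find . -type f -path "./*/*/snapshots/${snapshot_name}/*" | sort)
}

# Copy only those files; --files-from keeps the relative
# keyspace/table/snapshots/<snapshot_name> hierarchy under the destination.
sync_snapshots() {
  data_dir=$1
  dest=$2
  snapshot_name=${3:-*}
  snapshot_files "$data_dir" "$snapshot_name" |
    rsync -azP --files-from=- "$data_dir" "$dest"
}
```

For example, sync_snapshots /var/lib/cassandra/data <YOUR_BACKUP_DIR> would
copy every snapshot, and adding a snapshot name as a third argument would
copy only that one. The data path here is an assumption, so adjust it to
your installation. It is probably worth running snapshot_files on its own
first to check that the list is what you expect.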

Regards,
Anthony

On 12 May 2017 at 18:00, Daniel Hölbling-Inzko <
daniel.hoelbling-in...@bitmovin.com> wrote:

> Hi Varun,
> yes, you are right, that's the structure that gets created. But if I want
> to back up ALL columnfamilies at once, this requires quite a complex rsync,
> as Vladimir mentioned.
> I can't just copy over the /data/keyspace directory, as that contains all
> the data AND all the snapshots. I really have to go through this
> columnfamily by columnfamily, which is annoying.
>
> greetings Daniel
>
> On Thu, 11 May 2017 at 22:48 Varun Gupta <var...@uber.com> wrote:
>
>>
>> I did not completely understand your question, in particular "snapshot
>> files are mixed with files and backup files".
>>
>> When you call nodetool snapshot, it creates a directory named after the
>> snapshot name if one is specified, or the current timestamp otherwise, at
>> /data/<keyspace>/<columnfamily>/snapshots/<snapshotname>. This directory
>> contains all sstables, metadata files and schema.cql (if using 3.0.9 or
>> higher).
>>
>>
>> On Thu, May 11, 2017 at 2:37 AM, Daniel Hölbling-Inzko <
>> daniel.hoelbling-in...@bitmovin.com> wrote:
>>
>>> Hi,
>>> I am going through this guide to do backup/restore of Cassandra data to
>>> a new cluster:
>>> http://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_backup_snapshot_restore_t.html#task_ds_cmf_11r_gk
>>>
>>> When creating a snapshot I get the snapshot files mixed in with the
>>> normal data files and backup files, so it's all over the place and very
>>> hard (especially with lots of tables per keyspace) to transfer ONLY the
>>> snapshot.
>>> (Mostly since there is a snapshot directory per table..)
>>>
>>> Am I missing something or is there some arcane shell command that
>>> filters out only the snapshots?
>>> Because as it stands, it's much easier to just back up the whole data
>>> directory.
>>>
>>> greetings Daniel
>>>
>>
>>
