Hi Daniel,

Yes, you are right: it does require some additional work to rsync just the snapshots.

What about doing something like this to make the rsync syntax for the backup easier?

    # Run from the Cassandra data directory. Cassandra keeps snapshots under
    # <keyspace>/<table>/snapshots/<snapshot name>, so find returns one
    # ./<keyspace>/<table>/snapshots path per table.
    for snap_dir in $(find . -type d -name snapshots)
    do
        # iterate through each snapshot taken for that table
        for snap in $(ls "${snap_dir}")
        do
            # keep only the <keyspace>/<table> part of the path
            out_tbl=$(echo "${snap_dir}" | cut -d'/' -f2,3)
            # make the backup directory and perform the rsync
            mkdir -p "<YOUR_BACKUP_DIR>/${out_tbl}/${snap}"
            rsync -azP "${snap_dir}/${snap}/" "<YOUR_BACKUP_DIR>/${out_tbl}/${snap}"
        done
    done

Regards,
Anthony

On 12 May 2017 at 18:00, Daniel Hölbling-Inzko <daniel.hoelbling-in...@bitmovin.com> wrote:

> Hi Varun,
> yes you are right - that's the structure that gets created. But if I want
> to back up ALL columnfamilies at once this requires a quite complex rsync as
> Vladimir mentioned.
> I can't just copy over the /data/keyspace directory as that contains all
> the data AND all the snapshots. I really have to go through this
> columnfamily by columnfamily which is annoying.
>
> greetings Daniel
>
> On Thu, 11 May 2017 at 22:48 Varun Gupta <var...@uber.com> wrote:
>
>> I did not get your question completely, with "snapshot files are mixed
>> with files and backup files".
>>
>> When you call nodetool snapshot, it will create a directory with snapshot
>> name if specified or current timestamp at
>> /data/<keyspace>/<columnfamily>/snapshots/<snapshotname>. This directory
>> will have all sstables, metadata files and schema.cql (if using 3.0.9 or
>> higher).
>>
>> On Thu, May 11, 2017 at 2:37 AM, Daniel Hölbling-Inzko <daniel.hoelbling-in...@bitmovin.com> wrote:
>>
>>> Hi,
>>> I am going through this guide to do backup/restore of cassandra data to
>>> a new cluster:
>>> http://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_backup_snapshot_restore_t.html#task_ds_cmf_11r_gk
>>>
>>> When creating a snapshot I get the snapshot files mixed in with the
>>> normal data files and backup files, so it's all over the place and very
>>> hard (especially with lots of tables per keyspace) to transfer ONLY the
>>> snapshot.
>>> (Mostly since there is a snapshot directory per table..)
>>>
>>> Am I missing something or is there some arcane shell command that
>>> filters out only the snapshots?
>>> Because this way it's much easier to just backup the whole data
>>> directory.
>>>
>>> greetings Daniel
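
One possible way to preview what a per-table loop like the one at the top of this reply would copy, before any data moves: the sketch below only prints one command line per snapshot instead of running rsync. It assumes the usual on-disk layout, <data dir>/<keyspace>/<table>/snapshots/<snapshot name>. The function name plan_snapshot_rsync and its two arguments are made up for illustration; they are not part of Cassandra or rsync.

```shell
#!/bin/sh
# Hypothetical dry-run helper (illustrative name, not a Cassandra or rsync tool).
# Prints one "mkdir -p ... && rsync ..." line per snapshot; it copies nothing.
# Assumed layout: <data_dir>/<keyspace>/<table>/snapshots/<snapshot_name>/
plan_snapshot_rsync() {
    data_dir=${1%/}      # Cassandra data directory, trailing slash removed
    backup_dir=${2%/}    # backup destination, trailing slash removed

    find "$data_dir" -type d -name snapshots | sort | while read -r snap_dir
    do
        # Reduce <data_dir>/<keyspace>/<table>/snapshots to <keyspace>/<table>
        rel=${snap_dir#"$data_dir"/}
        tbl=${rel%/snapshots}

        for snap in "$snap_dir"/*/
        do
            # The glob stays literal when a table has no snapshots; skip it
            [ -d "$snap" ] || continue
            dest="$backup_dir/$tbl/$(basename "$snap")/"
            printf 'mkdir -p %s && rsync -azP %s %s\n' "$dest" "$snap" "$dest"
        done
    done
}
```

Called as `plan_snapshot_rsync <data dir> <backup dir>` with real paths, it lists the planned commands so they can be reviewed first. Printing rather than executing is a deliberate choice here, since it keeps the sketch safe to try on a live node.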