Hi,

We have a Cassandra cluster running Apache Cassandra 3.9 with 6 nodes and
RF = 3. As part of rebuilding the cluster, we are testing our backup and
restore strategy.

We took a snapshot and uploaded the files to S3, where the data is saved
in one folder per node (backup_folder1 - 6 for nodes 1 - 6).
We then created a new cluster with the same number of nodes, copied the
data from S3, and created the schema.

*Strategy 1: using nodetool refresh*
1) Copied the data back from S3, one backup folder per machine
(backup_folder1 - 6 onto the 6 nodes)
2) Ran nodetool refresh on the cluster.
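
For reference, step 2 can be sketched roughly as below. This is only an
illustration: the keyspace and table names are placeholders (not our real
ones), and the helper name is made up for this example. The command form is
the standard `nodetool refresh <keyspace> <table>`.

```python
import shlex


def refresh_command(keyspace, table):
    """Build the argv for `nodetool refresh <keyspace> <table>`.

    nodetool refresh picks up SSTables that were copied into the table's
    data directory on the local node.
    """
    return ["nodetool", "refresh", keyspace, table]


# Placeholder names; substitute the real keyspace and table.
cmd = refresh_command("my_keyspace", "my_table")
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run on each node once its
backup folder has been copied into place.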

Ran the count:

Count on previous cluster: 12125800
Count on new cluster: 10504780

*Strategy 2: using sstableloader*

1) Copied the data back from S3, one backup folder per machine
(backup_folder1 - 6 onto the 6 nodes)
2) Ran sstableloader on each node.
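
For reference, the sstableloader step can be sketched roughly as below. The
host addresses, backup path, keyspace, and table names are placeholders, and
the helper name is made up for this example. The command form is the usual
`sstableloader -d <hosts> <directory>`, where the directory path ends in
<keyspace>/<table>.

```python
import shlex
from pathlib import PurePosixPath


def sstableloader_command(hosts, backup_dir, keyspace, table):
    """Build the argv for `sstableloader -d <hosts> <dir>`.

    sstableloader takes the keyspace and table names from the last two
    components of the directory path, so the path is built to end in
    <keyspace>/<table>.
    """
    sstable_dir = PurePosixPath(backup_dir) / keyspace / table
    return ["sstableloader", "-d", ",".join(hosts), str(sstable_dir)]


# Placeholder values; substitute real node addresses and paths.
cmd = sstableloader_command(
    ["10.0.0.1", "10.0.0.2"], "/restore/backup_folder1", "my_keyspace", "my_table"
)
print(shlex.join(cmd))
```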

Ran the count:

Count on previous cluster: 12125800
Count on new cluster: 11705084
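
To put the two shortfalls side by side, here is a small calculation using
only the counts quoted above (the function and variable names are just for
this example):

```python
def restore_summary(original_count, restored_count):
    """Return (missing_rows, restored_percent) for one restore attempt."""
    missing = original_count - restored_count
    restored_pct = 100.0 * restored_count / original_count
    return missing, restored_pct


ORIGINAL = 12125800  # count on the previous cluster

results = {
    "nodetool refresh": restore_summary(ORIGINAL, 10504780),
    "sstableloader": restore_summary(ORIGINAL, 11705084),
}

for strategy, (missing, pct) in results.items():
    print(f"{strategy}: {missing} rows missing, {pct:.2f}% restored")
```

That works out to 1621020 rows missing (about 86.63% restored) with nodetool
refresh, and 420716 rows missing (about 96.53% restored) with sstableloader.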


Looking at the results, I am a bit disappointed that neither approach
gave me a 100% restore.
If the error were in taking the backup, the two approaches should not have
given different counts.

Any ideas on successful backup and restore strategies? And what could have
gone wrong in my process?

Thank You,
Regards,
Srini
