Hi,

Can I just copy the keyspace folders into a new Cassandra installation as a backup and restore strategy? I am trying to do that, but it isn't working.
I am using `K8ssandra` to run my single-node C* cluster and am experimenting with data backup and restore. Though K8ssandra uses Medusa for data backup and restore, I couldn't use it, so I thought I would test by simply copying/pasting the data directory. But I don't see my data after the restore. There could be mistakes in my approach, so I am not really sure where to look. For example:

1. K8ssandra uses Kubernetes Persistent Volume Claims. Does that mean that the data is actually stored somewhere else and not in the data directories of the keyspaces?
2. Is there a way to look into the files in the data directories of the keyspaces to check what data is there? Maybe the data isn't backed up properly.

The steps I followed to copy the data are:

1. GKE cluster -> default-pool -> found the node running the k8ssandra-dc1-default-sts-0 container.
2. Go to VM instances -> SSH to the node which is running the k8ssandra-dc1-default-sts-0 container.
3. Once SSHed, ran `docker exec -it k8s_cassandra_k8ssandra-dc1-default-sts-0_default_00b0d72a-c124-4b04-b25d-9e0f17edc582_0 /bin/bash`.

I noticed that the container has Cassandra at:

    /opt/cassandra
    ./opt/cassandra/bin/cassandra
    ./opt/cassandra/javadoc/org/apache/cassandra
    ./var/lib/cassandra
    ./var/log/cassandra

I then ran `cd opt/cassandra/data/data`. There were directories for each keyspace. I assume that when taking backups we can take a copy of this data directory, and when we need to restore, we can simply copy it back to the new node's data directory.

Note that I couldn't run nodetool inside the container (`nodetool flush` or `nodetool refresh`) due to a JMX issue. I don't know how important it is to run these commands. There is no traffic running on the system, though.

I copied the data directory from OUTSIDE the container (from the node) using `docker cp container name:src_path dest_path` (e.g.
`docker cp k8s_cassandra_k8ssandra-dc1-default-sts-0_default_00b0d72a-c124-4b04-b25d-9e0f17edc582_0:/opt/cassandra/data/data backup/`).

Then, to transfer the backup directory to Cloud Shell (the console in the web browser), I used `gcloud compute scp --recurse gke-k8ssandra-cluster-default-pool-1b1cc22a-rd6t:~/backup/data ~/K8ssandra_data_backup`. Then I copied it from Cloud Shell to my laptop/workstation using the Cloud Shell editor. This downloaded a tar of the backup (via a download link).

Then I downloaded a new .gz of C* 3.11.6 on my laptop. After unzipping it, I noticed that it didn't have a data directory. I ran C* and noticed that only the default keyspaces were present, and that the data directory had now been created. I then stopped C*.

Then I copied the contents of the backup folder (only the keyspace name folders, not all folders) into the data/data directory of the new Cassandra system, which wasn't running. Then I restarted the C* system, but I can't see the data via cqlsh. I can't see the keyspace either, which is probably because I should also copy the system and system-* folders. But is it safe to do so? I tried it but ran into several issues around cluster name, snitch, data center names, etc.

Would the approach of just copying/pasting folders work?

Thanks,
Manu
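
P.S. To make the "only keyspace name folders, not all folders" step concrete, here is a rough Python sketch of how the copied data directory could be split into system and non-system keyspaces, with a count of SSTable data files per keyspace. It is only an illustration and not a verified restore procedure. The helper names `split_keyspace_dirs` and `count_data_files` are hypothetical, and the sketch assumes that system keyspaces are the directories named `system` or starting with `system_` (what I call system-* above) and that SSTable data files end in `-Data.db`.

```python
from pathlib import Path


def split_keyspace_dirs(data_dir):
    """Return (user_keyspaces, system_keyspaces) as sorted lists of directory names.

    Assumption: a directory is a system keyspace if it is named "system"
    or its name starts with "system_".
    """
    user, system = [], []
    for entry in Path(data_dir).iterdir():
        if not entry.is_dir():
            continue  # ignore stray files next to the keyspace directories
        if entry.name == "system" or entry.name.startswith("system_"):
            system.append(entry.name)
        else:
            user.append(entry.name)
    return sorted(user), sorted(system)


def count_data_files(keyspace_dir):
    """Count files ending in "-Data.db" anywhere under a keyspace directory.

    Assumption: these are the SSTable data files. The search is recursive,
    so files under snapshot or backup subdirectories are counted as well.
    """
    return sum(1 for _ in Path(keyspace_dir).rglob("*-Data.db"))


if __name__ == "__main__":
    import sys

    # Usage: python check_backup.py <path to the copied data directory>
    root = Path(sys.argv[1])
    user, system = split_keyspace_dirs(root)
    print("system keyspaces:", system)
    for name in user:
        print(name, "data files:", count_data_files(root / name))
```

A keyspace that shows zero data files would suggest that nothing was on disk for it when the copy was taken, which is one reason I wonder how much the missing `nodetool flush` matters.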