Hi

Can I just copy the keyspace folders into a new Cassandra installation as a 
backup and restore strategy? I am trying to do that, but it isn’t working.

I am using `K8ssandra` to run my single-node C* cluster, and I am experimenting 
with data backup and restore. Although K8ssandra uses Medusa for backup and 
restore, I couldn’t use it, so I decided to test by simply copying and pasting 
the data directory. But I don’t see my data after the restore. There could be 
mistakes in my approach, so I am not really sure where to look. For example:

  1.  K8ssandra uses Kubernetes Persistent Volume Claims. Does that mean that 
the data is actually stored somewhere else and not in the data directories of 
the keyspaces?
  2.  Is there a way to look into the files in the keyspaces’ data directories 
to check what data is there? Maybe the data wasn’t backed up properly.
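As a rough illustration of the kind of check meant in question 2, the sketch below walks a copied data directory and counts the SSTable data files in each table folder. It assumes the usual on-disk layout of `<keyspace>/<table>-<id>/` with data files ending in `-Data.db`. The function name `list_sstables` and the path in the usage comment are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# List each table directory under a copied Cassandra data directory and
# count the SSTable data files (*-Data.db) directly inside it.
# Usage (path is a placeholder): list_sstables ~/K8ssandra_data_backup
list_sstables() {
  data_dir="$1"
  for table_dir in "$data_dir"/*/*/; do
    # Skip the unexpanded glob when nothing matches.
    [ -d "$table_dir" ] || continue
    keyspace=$(basename "$(dirname "$table_dir")")
    table=$(basename "$table_dir")
    # -maxdepth 1 leaves out the snapshots/ and backups/ subdirectories.
    count=$(find "$table_dir" -maxdepth 1 -type f -name '*-Data.db' | wc -l | tr -d ' ')
    printf '%s/%s %s\n' "$keyspace" "$table" "$count"
  done
}
```

A count of 0 for a table that should hold rows would suggest the data never reached disk in that directory. This only shows whether data files exist; reading the rows themselves would need a tool such as `sstabledump` (shipped with Cassandra 3.x under tools/bin), which this sketch does not cover.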

The steps I followed to copy the data are:

  1.  In the GKE cluster, under default-pool, I found the node running the 
k8ssandra-dc1-default-sts-0 container.
  2.  Under VM instances, I SSHed to that node.
  3.  Once SSHed in, I ran:
docker exec -it k8s_cassandra_k8ssandra-dc1-default-sts-0_default_00b0d72a-c124-4b04-b25d-9e0f17edc582_0 /bin/bash
I noticed that the container has Cassandra at these paths:
/opt/cassandra
./opt/cassandra/bin/cassandra
./opt/cassandra/javadoc/org/apache/cassandra
./var/lib/cassandra
./var/log/cassandra

I then ran “cd opt/cassandra/data/data”. There were directories for each 
keyspace. I assume that to take a backup we can copy this data directory, and 
that to restore we can simply copy it back into the new node’s data directory.

Note that I couldn’t run nodetool inside the container (nodetool flush or 
nodetool refresh) due to a JMX issue. I don’t know how important it is to run 
those commands. There is no traffic on the system, though.

I copied the data directory from OUTSIDE the container (from the node) using 
“docker cp container_name:src_path dest_path”, for example:
docker cp k8s_cassandra_k8ssandra-dc1-default-sts-0_default_00b0d72a-c124-4b04-b25d-9e0f17edc582_0:/opt/cassandra/data/data backup/

Then, to transfer the backup directory to Cloud Shell (the console in the web 
browser), I ran:
gcloud compute scp --recurse gke-k8ssandra-cluster-default-pool-1b1cc22a-rd6t:~/backup/data ~/K8ssandra_data_backup
I then copied it from Cloud Shell to my laptop/workstation using the Cloud 
Shell editor, which downloaded a tar of the backup (via a download link).

Then I downloaded a fresh .gz of C* 3.11.6 on my laptop. After extracting it, I 
noticed that it had no data directory. I started C* and saw that only the 
default keyspaces were present, and that the data directory had now been 
created. I then stopped C*.

Then I copied the contents of the backup folder (only the folders named after 
my keyspaces, not all folders) into the data/data directory of the new 
Cassandra installation while it wasn’t running. Then I restarted C*, but I 
can’t see the data via cqlsh. I can’t see the keyspace either, probably because 
I should also copy the system and system_* folders. But is it safe to do so? I 
tried it but ran into several issues around the cluster name, snitch, data 
center names, etc.
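The copy step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `copy_user_keyspaces` and the paths in the usage comment are hypothetical, and any folder named `system` or starting with `system_` is treated as a system keyspace and skipped.

```shell
#!/bin/sh
# Copy every keyspace directory except the system ones from a backup into
# the data directory of a stopped Cassandra node.
# Usage (paths are placeholders):
#   copy_user_keyspaces ~/K8ssandra_data_backup /path/to/cassandra/data/data
copy_user_keyspaces() {
  src="$1"
  dest="$2"
  mkdir -p "$dest"
  for ks_dir in "$src"/*/; do
    # Skip the unexpanded glob when the backup is empty.
    [ -d "$ks_dir" ] || continue
    ks_dir="${ks_dir%/}"            # drop the trailing slash from the glob
    ks=$(basename "$ks_dir")
    case "$ks" in
      system|system_*) continue ;;  # leave the new node's own system keyspaces alone
    esac
    # -R copies the keyspace directory together with its table subdirectories.
    cp -R "$ks_dir" "$dest/"
    echo "copied $ks"
  done
}
```

This only moves files into place. Whether the new node then recognises the keyspace and its tables without the matching schema is exactly the open question here.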

Would the approach of just copying and pasting the folders work?

Thanks
Manu