> On Jan 2, 2021, at 7:30 AM, Manu Chadha <manu.cha...@hotmail.com> wrote:
> 
> 
> Hi
>  
> Can I just copy the keyspace folders into a new Cassandra installation as a 
> backup and restore strategy? I am trying to do that but it isn’t working.
>  
> I am using `K8ssandra` to run my single node C* cluster. I am experimenting 
> with data backup and restore. Though K8ssandra uses Medusa for data backup 
> and restore, I couldn’t use it, so I thought to test by simply copying/pasting 
> the data directory. But I don’t see my data after restore. There could be 
> mistakes in my approach so I am not really sure where to look. For example,
> K8ssandra uses Kubernetes’ persistent Volume Claims. Does that mean that the 
> data is actually stored somewhere else and not in data directories of 
> keyspaces?
> Is there a way to look into the files in data directories of keyspaces to 
> check what data is there? Maybe the data isn’t backed up properly.
>  
> The steps I did to copy the data are:
> GKE cluster-> default-pool  -> found node running k8ssandra-dc1-default-sts-0 
> container
> Go to VM instances -> SSH to the node which is running 
> k8ssandra-dc1-default-sts-0 container
> Once SSHed, ran  “docker exec -it 
> k8s_cassandra_k8ssandra-dc1-default-sts-0_default_00b0d72a-c124-4b04-b25d-9e0f17edc582_0
>  /bin/bash”
> I noticed that the container has Cassandra :
> /opt/cassandra
> ./opt/cassandra/bin/cassandra
> ./opt/cassandra/javadoc/org/apache/cassandra
> ./var/lib/cassandra
> ./var/log/cassandra
>  
> cd opt/cassandra/data/data. There were directories for each keyspace. I 
> assume that when taking backups we can take a copy of this data directory. 
> Then once we need to restore, we can simply copy them back to new node’s data 
> directory.
>  
> Note that I couldn’t run nodetool inside the container (nodetool flush or 
> nodetool refresh) due to JMX issue. I don’t know how important it is to run 
> the command. There is no traffic running on the systems though.
>  
> I copied data directory from OUTSIDE container (from the node) using “docker 
> cp container name:src_path dest_path” (eg. docker cp 
> k8s_cassandra_k8ssandra-dc1-default-sts-0_default_00b0d72a-c124-4b04-b25d-9e0f17edc582_0:/opt/cassandra/data/data
>  backup/)
>  
> Then to transfer the backup directory to cloudshell (the console on web 
> browser), I used “gcloud compute scp --recurse 
> gke-k8ssandra-cluster-default-pool-1b1cc22a-rd6t:~/backup/data 
> ~/K8ssandra_data_backup”
> Then I copied from cloudshell to my laptop/workstation, using cloudshell 
> editor. This downloaded a tar of the backup (using a download link).
>  
> Then I downloaded a new .gz of C*3.11.6  on my laptop. After unzipping it, I 
> noticed that it hasn’t got a data directory. I ran C* and noticed that only 
> default keyspaces were present. I also noticed that data directory was now 
> created. I then stopped C*.
>  
> Then I copied contents of backup folder (only keyspace name folders, not all 
> folders) in data/data directory of a new Cassandra system which wasn’t 
> running. Then I restarted the C* system but I can’t see the data via cqlsh. I 
> can’t see the keyspace either, probably because I should also 
> copy the system and system-* folders. But is it safe to do so? I tried it but 
> ran into several issues around cluster name, snitch, data center names etc.

The schemas are stored in the system_schema keyspace, so unless you copy that 
as well, the new node has no definition for your keyspaces and tables and it’s 
not going to work.

Alternatively, you can issue the DDL (the CREATE KEYSPACE / CREATE TABLE 
statements) on your laptop. That makes new table directories, and you can then 
copy the data files into those directories. This is your safest and easiest 
option most of the time.
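
The copy step in that second option can be sketched roughly as below. This is only an illustration, not a tested restore tool: it assumes the keyspace and tables have already been created on the new node with the same schema, that each table has exactly one directory, and that Cassandra is stopped while files are copied. Cassandra names each table directory `<table>-<table id>`, and the id differs on the new node, so directories are matched by table name. The function names and paths here are hypothetical.

```python
import re
import shutil
from pathlib import Path
from typing import Dict, Optional

# Cassandra table directories are named "<table>-<32 hex char table id>".
TABLE_DIR = re.compile(r"^(?P<table>.+)-[0-9a-f]{32}$")


def table_name(directory: Path) -> Optional[str]:
    """Return the table name encoded in a table directory name, or None."""
    match = TABLE_DIR.match(directory.name)
    return match.group("table") if match else None


def restore_keyspace(backup_ks: Path, target_ks: Path) -> Dict[str, int]:
    """Copy data files from backed-up table directories into the table
    directories that CREATE TABLE made on the new node.

    Returns the number of files copied per table. Tables with no matching
    directory on the new node are skipped.
    """
    # Map table name -> new directory (assumes one directory per table).
    targets = {}
    for d in target_ks.iterdir():
        name = table_name(d) if d.is_dir() else None
        if name is not None:
            targets[name] = d

    copied = {}
    for src in backup_ks.iterdir():
        name = table_name(src) if src.is_dir() else None
        if name is None or name not in targets:
            continue
        count = 0
        for f in src.iterdir():
            # Copy only plain files; skip snapshots/ and backups/ subdirs.
            if f.is_file():
                shutil.copy2(f, targets[name] / f.name)
                count += 1
        copied[name] = count
    return copied
```

A call might look like `restore_keyspace(Path("backup/data/my_keyspace"), Path("data/data/my_keyspace"))`, with both paths adjusted to your layout. After copying, start Cassandra again, or on a running node load the new files with `nodetool refresh <keyspace> <table>` for each table.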

>  
> Would the approach of just copy/pasting folder work ?
>  
> Thanks
> Manu
>  
