Re: Move Production data to Development Cluster

2017-07-07 Thread Pranay Akula
Hello Jonathan,

Since both clusters are the same size:

> Do I copy the snapshots from all the nodes?

Yes, this will work. Just make sure you are copying each node's data to the node that owns the associated tokens.
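
A minimal sketch of that node-to-node pairing, assuming a one-to-one mapping of production to development nodes (the IPs, host names, keyspace/table names, snapshot tag, and UUIDs are placeholders, not values from this thread):

$ # tokens owned by production node 1; these go into initial_token on dev node 1 only
$ nodetool ring | grep 10.0.0.1 | awk '{print $NF ","}' | xargs
$ # copy production node 1's snapshot into the matching table directory on dev node 1
$ rsync -av /var/lib/cassandra/data/ks1/table1-<old_uuid>/snapshots/snap_2017_07_07/ \
      dev-node1:/var/lib/cassandra/data/ks1/table1-<new_uuid>/
$ # repeat per pair: production node 2 -> dev node 2, production node 3 -> dev node 3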


Thanks,
Pranay.


Move Production data to Development Cluster

2017-07-07 Thread Jonathan Baynes
Hi,

Can anyone help me? I'm trying (and failing) to move the data from my 3-node production C* cluster to my 3-node development cluster.

Here is the fine print...

Oracle Linux 7.3
C* 3.0.11

3 nodes (256 virtual nodes each)
1 keyspace (replication factor 3, QUORUM consistency)
1 table

Snapshot taken on each node.

Attempt 1

I've tried the following procedure:
http://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsSnapshotRestoreNewCluster.html



  1.  From the old cluster, retrieve the list of tokens associated with each 
node's IP:

$ nodetool ring | grep ip_address_of_node | awk '{print $NF ","}' | xargs



I've done this for all 3 nodes and placed the results together in one string (see the sketch below).
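
As a sketch, the same command run once per node address from any single host (the IP addresses are placeholders). Each node's token list is kept separate here because, per the restore procedure and the reply above, each list ends up in a different new node's initial_token:

$ for ip in 10.0.0.1 10.0.0.2 10.0.0.3; do
>     echo "# tokens for $ip"
>     nodetool ring | grep "$ip" | awk '{print $NF ","}' | xargs
> done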

  2.  In the cassandra.yaml file for each node in the new cluster, add the list of tokens you obtained in the previous step to the initial_token parameter, using the same num_tokens setting as in the old cluster.

Added all the tokens from step one.
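
For illustration, a sketch of the relevant settings on one new node, reusing the token values from the example further down (abbreviated with "..." as in that example). Note that Cassandra validates that the number of entries in initial_token matches num_tokens, so with num_tokens: 256 each node can hold exactly one source node's 256 tokens, not all three lists concatenated; that mismatch would produce yaml errors like the ones described below:

# cassandra.yaml on new node 1 (sketch)
num_tokens: 256
initial_token: -9211270970129494930, -9138351317258731895, -8980763462514965928, ...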

  3.  Make any other necessary changes in the new cluster's cassandra.yaml and property files so that the new nodes match the old cluster settings. Make sure the seed nodes are set for the new cluster.
  4.  Clear the system table data from each new node:

$ sudo rm -rf /var/lib/cassandra/data/system/*

This allows the new nodes to use the initial tokens defined in cassandra.yaml when they restart.
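
As a sketch, the per-node sequence for this step, assuming a package install managed by systemd (the service name may differ on your hosts):

$ sudo systemctl stop cassandra                   # stop the node before touching its data
$ sudo rm -rf /var/lib/cassandra/data/system/*    # clear system tables so initial_token is honoured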

  5.  Start each node using the specified list of token ranges in the new cluster's cassandra.yaml:

initial_token: -9211270970129494930, -9138351317258731895, 
-8980763462514965928, ...

  6.  Create the schema in the new cluster. All the schemas from the old cluster must be reproduced in the new cluster.
  7.  Stop the node. Using nodetool refresh is unsafe because files within the data directory of a running node can be silently overwritten by identically named just-flushed SSTables from memtable flushes or compaction. Copying files into the data directory and restarting the node will not work for the same reason.
  8.  Restore the snapshotted SSTable files from the old cluster onto the new cluster using the same directories, while noting that the UUID component of the target directory names has changed; without restoration, the new cluster will have no data to read upon restart (see the sketch after this list).
  9.  Restart the node.
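
A minimal sketch of step 8 for one node and one table. The keyspace, table, snapshot tag, and UUIDs are placeholders; the target UUID is whatever directory name the new cluster generated when the schema was recreated in step 6, and the chown assumes a package install running as the cassandra user:

$ # source: the snapshot taken on the corresponding old node
$ SRC=/backup/ks1/table1-<old_uuid>/snapshots/snap_2017_07_07
$ # target: the table directory created for the recreated schema
$ DST=/var/lib/cassandra/data/ks1/table1-<new_uuid>
$ sudo cp "$SRC"/* "$DST"/                        # safe because the node is stopped (step 7)
$ sudo chown -R cassandra:cassandra "$DST"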

When I restart, I get errors about the token ranges in the yaml file.

If I take out the tokens for 2 of the nodes and just use one node's tokens, I can restore the data, but I get only a third of the row count I'd expect.

I then noticed I was only restoring the snapshot from that one node, so that made sense.

So then I took all of the snapshots from all of the nodes, placed them into one folder, re-added all the tokens, and re-ran the process, but I get the token range error in the yaml again.


Attempt 2

So then I tried sstableloader from the same folder, with all three nodes' snapshots in it, and then I get corruption on the SSTables.
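
For reference, a sketch of an sstableloader run (hosts and paths are placeholders). Two things worth noting: sstableloader expects the last two components of the path to be <keyspace>/<table>, and SSTable file names (e.g. mc-1-big-Data.db) are not unique across nodes, so dropping three nodes' snapshots into one folder can silently overwrite files, which is one plausible explanation for the corruption seen here. Streaming one node's snapshot at a time avoids the name clash:

$ # lay the files out as <keyspace>/<table>, then stream them to the dev cluster
$ mkdir -p /tmp/load/ks1/table1
$ cp /backup/node1/ks1/table1-<uuid>/snapshots/snap_2017_07_07/* /tmp/load/ks1/table1/
$ sstableloader -d dev-node1,dev-node2,dev-node3 /tmp/load/ks1/table1
$ # empty the folder and repeat for node 2's and node 3's snapshots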


Advice

I've tried this so many ways it's getting confusing. What's right?

Can anyone give me some pointers on the best route to migrate data from cluster to cluster? The documentation is vague and not detailed enough.

Do I copy the snapshots from all the nodes?
Do I just work on one node at a time?

Any suggestions, please?

Thanks
J



Jonathan Baynes
DBA
Tradeweb Europe Limited
Moor Place  •  1 Fore Street Avenue  •  London EC2Y 9DT
P +44 (0)20 77760988  •  F +44 (0)20 7776 3201  •  M +44 (0) xx
jonathan.bay...@tradeweb.com
