Re: Move Production data to Development Cluster

2017-07-07 Thread Pranay Akula
Hello Jonathan,

Since both clusters are the same size:

> Do I copy the snapshots from all the nodes?

Yes, this will work. Just make sure you are copying each node's data to the node that owns the associated tokens.
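
A minimal sketch of that node-to-node pairing, assuming a one-to-one mapping of production to development nodes (the IPs, host names, keyspace/table names, snapshot tag, and UUIDs are placeholders, not values from this thread):

$ # tokens owned by production node 1; these go into initial_token on dev node 1 only
$ nodetool ring | grep 10.0.0.1 | awk '{print $NF ","}' | xargs
$ # copy production node 1's snapshot into the matching table directory on dev node 1
$ rsync -av /var/lib/cassandra/data/ks1/table1-<old_uuid>/snapshots/snap_2017_07_07/ \
      dev-node1:/var/lib/cassandra/data/ks1/table1-<new_uuid>/
$ # repeat per pair: production node 2 -> dev node 2, production node 3 -> dev node 3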


Thanks,
Pranay.


Move Production data to Development Cluster

2017-07-07 Thread Jonathan Baynes
Hi,

Can anyone help me? I'm trying (and failing) to move the data from my 3-node production C* cluster to my 3-node development cluster.

Here is the fine print...

Oracle Linux 7.3
C* 3.0.11

3 nodes (256 virtual nodes each)
1 keyspace (replication factor 3, QUORUM consistency)
1 table

Snapshot taken on each node.

Attempt 1

I've tried the following procedure:
http://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsSnapshotRestoreNewCluster.html



  1.  From the old cluster, retrieve the list of tokens associated with each 
node's IP:

$ nodetool ring | grep ip_address_of_node | awk '{print $NF ","}' | xargs



I've done this for all 3 nodes and placed the results together in one string (see the sketch below).
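
As a sketch, the same command run once per node address from any single host (the IP addresses are placeholders). Each node's token list is kept separate here because, per the restore procedure and the reply above, each list ends up in a different new node's initial_token:

$ for ip in 10.0.0.1 10.0.0.2 10.0.0.3; do
>     echo "# tokens for $ip"
>     nodetool ring | grep "$ip" | awk '{print $NF ","}' | xargs
> done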

  2.  In the cassandra.yaml file for each node in the new cluster, add the list of tokens you obtained in the previous step to the initial_token parameter, using the same num_tokens setting as in the old cluster.

Added all the tokens from step one.
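
For illustration, a sketch of the relevant settings on one new node, reusing the token values from the example further down (abbreviated with "..." as in that example). Note that Cassandra validates that the number of entries in initial_token matches num_tokens, so with num_tokens: 256 each node can hold exactly one source node's 256 tokens, not all three lists concatenated; that mismatch would produce yaml errors like the ones described below:

# cassandra.yaml on new node 1 (sketch)
num_tokens: 256
initial_token: -9211270970129494930, -9138351317258731895, -8980763462514965928, ...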

  3.  Make any other necessary changes in the new cluster's cassandra.yaml and property files so that the new nodes match the old cluster settings. Make sure the seed nodes are set for the new cluster.
  4.  Clear the system table data from each new node:

$ sudo rm -rf /var/lib/cassandra/data/system/*

This allows the new nodes to use the initial tokens defined in cassandra.yaml when they restart.
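
As a sketch, the per-node sequence for this step, assuming a package install managed by systemd (the service name may differ on your hosts):

$ sudo systemctl stop cassandra                   # stop the node before touching its data
$ sudo rm -rf /var/lib/cassandra/data/system/*    # clear system tables so initial_token is honoured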

  5.  Start each node using the specified list of token ranges in the new cluster's cassandra.yaml:

initial_token: -9211270970129494930, -9138351317258731895, 
-8980763462514965928, ...

  6.  Create the schema in the new cluster. All the schemas from the old cluster must be reproduced in the new cluster.
  7.  Stop the node. Using nodetool refresh is unsafe because files within the data directory of a running node can be silently overwritten by identically named just-flushed SSTables from memtable flushes or compaction. Copying files into the data directory and restarting the node will not work for the same reason.
  8.  Restore the snapshotted SSTable files from the old cluster onto the new cluster using the same directories, while noting that the UUID component of the target directory names has changed; without restoration, the new cluster will have no data to read upon restart (see the sketch after this list).
  9.  Restart the node.
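
A minimal sketch of step 8 for one node and one table. The keyspace, table, snapshot tag, and UUIDs are placeholders; the target UUID is whatever directory name the new cluster generated when the schema was recreated in step 6, and the chown assumes a package install running as the cassandra user:

$ # source: the snapshot taken on the corresponding old node
$ SRC=/backup/ks1/table1-<old_uuid>/snapshots/snap_2017_07_07
$ # target: the table directory created for the recreated schema
$ DST=/var/lib/cassandra/data/ks1/table1-<new_uuid>
$ sudo cp "$SRC"/* "$DST"/                        # safe because the node is stopped (step 7)
$ sudo chown -R cassandra:cassandra "$DST"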

When I restart, I get errors about the token ranges in the yaml file.

If I take out the tokens for 2 of the nodes and just use one node's tokens, I can restore the data, but I get only a third of the row count I'd expect.

I then noticed I was only restoring the snapshot from that one node, so that made sense.

So then I took all of the snapshots from all of the nodes, placed them into one folder, re-added all the tokens, and re-ran the process, but I get the token range error in the yaml again.


Attempt 2

So then I tried sstableloader from the same folder, with all three nodes' snapshots in it, and then I get corruption on the SSTables.
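
For reference, a sketch of an sstableloader run (hosts and paths are placeholders). Two things worth noting: sstableloader expects the last two components of the path to be <keyspace>/<table>, and SSTable file names (e.g. mc-1-big-Data.db) are not unique across nodes, so dropping three nodes' snapshots into one folder can silently overwrite files, which is one plausible explanation for the corruption seen here. Streaming one node's snapshot at a time avoids the name clash:

$ # lay the files out as <keyspace>/<table>, then stream them to the dev cluster
$ mkdir -p /tmp/load/ks1/table1
$ cp /backup/node1/ks1/table1-<uuid>/snapshots/snap_2017_07_07/* /tmp/load/ks1/table1/
$ sstableloader -d dev-node1,dev-node2,dev-node3 /tmp/load/ks1/table1
$ # empty the folder and repeat for node 2's and node 3's snapshots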


Advice

I've tried this so many ways it's getting confusing. What's right?

Can anyone give me some pointers on the best route to migrate data from cluster to cluster? The documentation is vague and not detailed enough.

Do I copy the snapshots from all the nodes?
Do I just work on one node at a time?

Any suggestions, please?

Thanks
J



Jonathan Baynes
DBA
Tradeweb Europe Limited
Moor Place  •  1 Fore Street Avenue  •  London EC2Y 9DT
P +44 (0)20 77760988  •  F +44 (0)20 7776 3201  •  M +44 (0) xx
jonathan.bay...@tradeweb.com
