> Since the replication factor is 2 in the first cluster, I
> won't lose any data.
Assuming you have been running repair or working at CL QUORUM (which is the 
same as CL ALL for RF 2)
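For the record, the quorum arithmetic for RF 2:

    quorum = floor(RF / 2) + 1 = floor(2 / 2) + 1 = 2   (i.e. every replica)

Which is why QUORUM and ALL behave identically at RF 2.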

> Is it advisable and safe to go ahead?
So the plan is to turn off 2 nodes in the first cluster, re-task them into 
the new cluster, and then reverse the process?

If you simply turn two nodes off in the first cluster you will reduce the 
availability for a portion of the ring. 25% of the keys will now have at best 1 
node they can be stored on. If a node is having any sort of problem, and it 
is a replica for one of the down nodes, the cluster will appear down for 12.5% 
of the keyspace.
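Rough numbers, assuming evenly balanced tokens:

    8 nodes -> each token range is 1/8 = 12.5% of the keys
    RF 2    -> each range lives on exactly 2 nodes, so a range that loses
               one replica survives at CL ONE, and a range that loses both
               replicas is completely unavailable (that 12.5%)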

If you work at QUORUM you will not have enough nodes available to read or 
write 25% of the keys. 

If you decommission the nodes instead, you will still have 2 replicas available 
for each key range. This is the path I would recommend.
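That would look something like this, run on each of the two nodes one at a 
time (<node_ip> is a placeholder; note that decommission streams the node's 
data to the remaining replicas, which is the streaming overhead you were 
hoping to avoid):

    nodetool -h <node_ip> decommission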

If you _really_ need to do it, what you suggest will probably work. Some tips 
(example commands after the list):

* do safe shutdowns - nodetool disablegossip, disablethrift, drain
* don't forget to copy the yaml file. 
* in the first cluster, the other nodes will collect hints for the first hour 
the nodes are down. You are not going to want these, so disable hinted 
handoff (HH). 
* get the nodes back into the first cluster before gc_grace_seconds expires. 
* bring them back and repair them.
* when you bring them back, reading at CL ONE will give inconsistent results. 
Reading at QUORUM may result in a lot of repair activity.
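Roughly what those steps look like; <node_ip> is a placeholder:

    # safe shutdown, on each node you are taking out
    nodetool -h <node_ip> disablegossip
    nodetool -h <node_ip> disablethrift
    nodetool -h <node_ip> drain

    # in cassandra.yaml on the remaining nodes, to stop them storing hints
    # (the "first hour" above is max_hint_window_in_ms, default 3600000)
    hinted_handoff_enabled: false

    # once the nodes are back in the first cluster
    # (gc_grace_seconds defaults to 864000, i.e. 10 days)
    nodetool -h <node_ip> repair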

Hope that helps. 
 
-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 11/07/2012, at 6:35 AM, rohit bhatia wrote:

> Hi
> 
> I want to take out 2 nodes from an 8 node cluster and use them in
> another cluster, but can't afford the overhead of streaming the data
> and rebalancing the cluster. Since the replication factor is 2 in the
> first cluster, I won't lose any data.
> 
> I'm planning to save my commit_log and data directories and
> bootstrap the nodes in the second cluster. Afterwards I'll just
> replace both directories and join the nodes back to the original
> cluster. This should work since Cassandra saves all the cluster and
> schema info in the system keyspace.
> 
> Is it advisable and safe to go ahead?
> 
> Thanks
> Rohit
