Make an existing cluster multi data-center compatible.
Hi all,

I want to add a data-center to an existing single data-center cluster. First I have to make the existing cluster multi data-center compatible.

The existing cluster is a 12 node cluster with:
- Replication factor = 3
- Placement strategy = SimpleStrategy
- Endpoint snitch = SimpleSnitch

If I change the following:
- Placement strategy = NetworkTopologyStrategy
- Endpoint snitch = PropertyFileSnitch
- all 12 nodes in this file belong to the same data-center and rack.

Do I have to run full repairs after this change? Because the yaml file states:

IF YOU CHANGE THE SNITCH AFTER DATA IS INSERTED INTO THE CLUSTER, YOU MUST RUN A FULL REPAIR, SINCE THE SNITCH AFFECTS WHERE REPLICAS ARE PLACED.

Thanks!

Rene
Re: Make an existing cluster multi data-center compatible.
Yes, you must run a full repair for the reasons stated in the yaml file.

Mark

On Tue, Aug 5, 2014 at 11:52 AM, Rene Kochen rene.koc...@schange.com wrote:
> Do I have to run full repairs after this change?
Re: Make an existing cluster multi data-center compatible.
What I understand is that SimpleStrategy determines the endpoints for replicas by traversing the ring clockwise. NetworkTopologyStrategy also traverses the ring clockwise, but takes the rack and DC locations into account when choosing replicas.

Since the file used by PropertyFileSnitch puts all endpoints in the same data-center and rack, isn't the result of the endpoint selection basically the same?

Thanks!

Rene

2014-08-05 12:56 GMT+02:00 Mark Reddy mark.re...@boxever.com:
> Yes, you must run a full repair for the reasons stated in the yaml file.
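The reasoning in this message can be checked with a small simulation. The following is only a simplified sketch of the two selection rules, not Cassandra's actual code: the ring tokens, node names, and the function names `ring_walk`, `simple_replicas`, and `rack_aware_replicas` are all hypothetical, and the rack-aware rule is restricted to a single data-center.

```python
from bisect import bisect_left


def ring_walk(ring, token):
    """Yield node names clockwise, starting at the first position with token >= `token`."""
    tokens = [t for t, _ in ring]
    start = bisect_left(tokens, token) % len(ring)
    for i in range(len(ring)):
        yield ring[(start + i) % len(ring)][1]


def simple_replicas(ring, token, rf):
    """SimpleStrategy-style choice: the first `rf` distinct nodes on the walk."""
    chosen = []
    for node in ring_walk(ring, token):
        if node not in chosen:
            chosen.append(node)
        if len(chosen) == rf:
            break
    return chosen


def rack_aware_replicas(ring, token, rf, racks):
    """Simplified single-DC NetworkTopologyStrategy-style choice.

    `racks` maps node -> rack name. Nodes whose rack already holds a
    replica are set aside until every rack has been used once.
    """
    all_racks = set(racks.values())
    chosen, skipped, seen = [], [], set()
    for node in ring_walk(ring, token):
        if len(chosen) >= rf:
            break
        if node in chosen or node in skipped:
            continue
        if racks[node] not in seen:
            chosen.append(node)
            seen.add(racks[node])
            if seen == all_racks:
                # Every rack is covered: fall back to the skipped nodes.
                chosen.extend(skipped)
                skipped = []
        elif seen == all_racks:
            chosen.append(node)
        else:
            skipped.append(node)
    return chosen[:rf]


# Hypothetical 12-node ring with evenly spaced tokens.
ring = [(i * 100, f"node{i:02d}") for i in range(12)]
one_rack = {node: "RAC1" for _, node in ring}
two_racks = {node: ("RAC1" if i < 6 else "RAC2") for i, (_, node) in enumerate(ring)}

for token in range(0, 1200, 50):
    assert simple_replicas(ring, token, 3) == rack_aware_replicas(ring, token, 3, one_rack)
print("single rack: same replicas for every sampled token")
print("two racks, token 0:", rack_aware_replicas(ring, 0, 3, two_racks))
```

Under these assumptions, the two rules agree whenever every node reports the same rack, and diverge as soon as the nodes are spread over more than one rack.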
Re: Make an existing cluster multi data-center compatible.
On Tue, Aug 5, 2014 at 3:52 AM, Rene Kochen rene.koc...@schange.com wrote:
> Do I have to run full repairs after this change? Because the yaml file states:
>
> IF YOU CHANGE THE SNITCH AFTER DATA IS INSERTED INTO THE CLUSTER, YOU MUST RUN A FULL REPAIR, SINCE THE SNITCH AFFECTS WHERE REPLICAS ARE PLACED.

As long as you correctly configure the new snitch so that the replica sets do not change, no, you do not need to repair.

Barring that, if you manage to transform the replica set in such a way that you always have one (fully repaired) replica from the old set, repair will help. I do not recommend this very risky practice.

In practice, the only snitch change in a cluster with data that is likely to be safe is one whose result is a NOOP in terms of replica placement.

In fact, the yaml file is stating something unreasonable there, because repair cannot protect against this case:

6 node cluster, A B C D E F, RF = 2

1) Start with SimpleSnitch so that A, B have the two replicas of row key X.
2) Write row key X, value Y, to nodes A and B.
3) Change to OtherSnitch so that now C, D are responsible for row key X.
4) Repair and notice that neither C nor D answer Y when asked for row X.

=Rob
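The six-node case above can be played through with a toy model. This is a deliberately simplified sketch: `repair` here just copies a value between the nodes of the current replica set, which is an assumption made for illustration and not how Cassandra's repair is implemented. The function names and the dictionary-based "stores" are hypothetical.

```python
def repair(stores, replicas, key):
    """Toy repair: the current replicas for `key` sync with each other only."""
    known = [stores[n][key] for n in replicas if key in stores[n]]
    if known:
        for n in replicas:
            stores[n][key] = known[0]


def read(stores, replicas, key):
    """Return what each current replica answers for `key` (None if missing)."""
    return [stores[n].get(key) for n in replicas]


def scenario(old_replicas, new_replicas):
    stores = {n: {} for n in "ABCDEF"}   # 6 node cluster, RF = 2
    for n in old_replicas:               # write X = Y under the old snitch
        stores[n]["X"] = "Y"
    repair(stores, new_replicas, "X")    # repair after the snitch change
    return read(stores, new_replicas, "X")


# Disjoint replica sets: repair has nothing to work from.
print(scenario(["A", "B"], ["C", "D"]))
# One node from the old set survives: repair can propagate the value.
print(scenario(["A", "B"], ["B", "C"]))
```

In this model the first call leaves both new replicas without the value, while the second recovers it because B belongs to both the old and the new set, which is the overlap condition described in the message.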
Re: Make an existing cluster multi data-center compatible.
2014-08-05 20:05 GMT+02:00 Robert Coli rc...@eventbrite.com:
> As long as you correctly configure the new snitch so that the replica sets do not change, no, you do not need to repair.

Is the following correct: the replica sets do not change if you switch from SimpleSnitch with SimpleStrategy to PropertyFileSnitch with NetworkTopologyStrategy, and the topology file puts all nodes in the same data-center and rack?

Thanks again!

Rene
Re: Make an existing cluster multi data-center compatible.
On Tue, Aug 5, 2014 at 2:27 PM, Rene Kochen rene.koc...@schange.com wrote:
> Is the following correct: the replica sets do not change if you switch from SimpleSnitch with SimpleStrategy to PropertyFileSnitch with NetworkTopologyStrategy, and the topology file puts all nodes in the same data-center and rack?

Yes, you can use nodetool getendpoints to illustrate this programmatically.

1) Make a set of keys with a key from each range.
2) Run getendpoints for this set of keys.
3) Change the snitch.
4) Run getendpoints again and compare the result with the output from step 2.

=Rob
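The four steps above could be scripted roughly as follows. This is a minimal sketch that assumes `nodetool getendpoints <keyspace> <table> <key>` prints one endpoint address per line on stdout; the helper names (`parse_endpoints`, `get_endpoints`, `snapshot`, `changed_keys`) and the keyspace, table, and key names in the usage comment are hypothetical.

```python
import subprocess


def parse_endpoints(output):
    """Turn getendpoints output (assumed: one address per line) into a set."""
    return frozenset(line.strip() for line in output.splitlines() if line.strip())


def get_endpoints(keyspace, table, key, nodetool="nodetool"):
    """Ask nodetool which nodes hold replicas for `key`."""
    result = subprocess.run(
        [nodetool, "getendpoints", keyspace, table, key],
        check=True, capture_output=True, text=True,
    )
    return parse_endpoints(result.stdout)


def snapshot(keyspace, table, keys, lookup=get_endpoints):
    """Record the replica set of every sample key."""
    return {key: lookup(keyspace, table, key) for key in keys}


def changed_keys(before, after):
    """Return the sample keys whose replica set differs between two snapshots."""
    return sorted(k for k in before if before[k] != after.get(k))


# Usage (names are placeholders):
#   before = snapshot("my_keyspace", "my_table", sample_keys)
#   ... change the snitch and placement strategy ...
#   after = snapshot("my_keyspace", "my_table", sample_keys)
#   print(changed_keys(before, after))   # an empty list means no key moved
```

Replica sets are compared as sets rather than ordered lists, since only membership matters for deciding whether data has to move. The check is only as good as the sample: it needs at least one key from each token range, as step 1 says.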
Re: Make an existing cluster multi data-center compatible.
I think the rack placement of these 12 nodes will become important. Since the 12 nodes currently use SimpleSnitch, which is not rack aware, it would be good to keep them in a single rack in the PropertyFileSnitch topology file as well, at least initially. Running a repair is a safe option.

If you need to change the rack placement, my take would be to increase the replication factor to at least 3 and then distribute the nodes across different racks.

This is not an expert opinion but a newbie thought.

Regards,
Rameez

On Tue, Aug 5, 2014 at 11:35 PM, Robert Coli rc...@eventbrite.com wrote:
> As long as you correctly configure the new snitch so that the replica sets do not change, no, you do not need to repair.