RE: Problems after trying a migration
Hi Jan,

Thank you for your help. We'll look into this next week.

Have a nice day.

Best regards,

David CHARBONNIER
Sysadmin
T: +33 411 934 200
david.charbonn...@rgsystem.com
ZAC Aéroport
125 Impasse Adam Smith
34470 Pérols - France
www.rgsystem.com

From: Jan [mailto:cne...@yahoo.com]
Sent: Thursday, 19 March 2015 05:09
To: user@cassandra.apache.org
Subject: Re: Problems after trying a migration

Hi David;

some input to get back to where you were:

a) Start with the French cluster only and get it working with DSE 4.5.1.
b) The OpsCenter keyspace is RF1 by default; alter the keyspace to RF3.
c) Take a full snapshot of all your nodes and copy the files to a safe location on all the nodes.

To migrate the data into the new cluster:

a) Use the same version, DSE 4.5.1, in Luxembourg and bring up 1 node at a time. Check that the node has come up in the new datacenter.
b) Bring up new nodes into the new datacenter one at a time.
c) After all your new nodes are UP in Luxembourg, run a 'nodetool repair -parallel'.
d) Check in OpsCenter that all your nodes are showing up (new and old).
e) Start taking down your nodes in France, one at a time.
f) After all the nodes in France are down, run a 'nodetool repair -parallel' again.
g) Upgrade the nodes in Luxembourg to DSE 4.6.1.
h) Run a 'nodetool repair -parallel' again.
i) Upgrade to OpsCenter 5.1.

Best of luck, hope this helps.

Jan/

On Wednesday, March 18, 2015 1:01 PM, Robert Coli rc...@eventbrite.com wrote:

> On Wed, Mar 18, 2015 at 9:05 AM, David CHARBONNIER david.charbonn...@rgsystem.com wrote:
>
>> - New nodes in the other country have been installed like French nodes except for Datastax Enterprise version (4.5.1 in France and 4.6.1 in the other country which means Cassandra version 2.0.8.39 in France and 2.0.12.200 in the other country)
>
> This is officially unsupported, and might be a cause of problems during this process.
>
> =Rob
RE: Problems after trying a migration
Hi Fabien,

Thank you for the link! That's exactly what we want to do. But before starting this, we need to clean up the mess in order to get a clean cluster.

Thanks for your help.

Best regards,

David CHARBONNIER
Sysadmin
T: +33 411 934 200
david.charbonn...@rgsystem.com
ZAC Aéroport
125 Impasse Adam Smith
34470 Pérols - France
www.rgsystem.com

From: Fabien Rousseau [mailto:fab...@yakaz.com]
Sent: Wednesday, 18 March 2015 17:32
To: user
Subject: Re: Problems after trying a migration

Hi David,

There is an excellent article which describes exactly what you want to do (i.e. migrate from one DC to another DC):
http://planetcassandra.org/blog/cassandra-migration-to-ec2/

2015-03-18 17:05 GMT+01:00 David CHARBONNIER david.charbonn...@rgsystem.com:

Hi,

We're using Cassandra through the DataStax Enterprise package, version 4.5.1 (Cassandra version 2.0.8.39), with 7 nodes in a single datacenter. We need to move our Cassandra cluster from France to another country. To do this, we want to add a second 7-node datacenter to our cluster and stream all data between the two countries before dropping the first datacenter.

On January 31st, we tried doing so but ran into some problems:

- The new nodes in the other country were installed like the French nodes except for the DataStax Enterprise version (4.5.1 in France and 4.6.1 in the other country, which means Cassandra version 2.0.8.39 in France and 2.0.12.200 in the other country).
- We followed this procedure: http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_add_dc_to_cluster_t.html but an error occurred during step 3: the new nodes were started before the cassandra-topology.properties file had been updated on the original datacenter, so the new nodes appeared in the original datacenter instead of the new one.
- To recover our original cluster, we decommissioned every node of the new datacenter with the nodetool decommission command.

On February 9th, the nodes in the second datacenter were restarted and joined the cluster. We had to decommission them just like before.

On February 11th, we added disk space on our 7 running French nodes. To do this, we restarted the cluster, but the nodes updated their peering information and the nodes from Luxembourg (decommissioned on February 9th) were present again. This behaviour is described here: https://issues.apache.org/jira/browse/CASSANDRA-7825. So we cleaned the content of the system.peers table.

On March 11th, we needed to add an 8th node to our existing French cluster. We installed the same DataStax Enterprise version (4.5.1 with Cassandra 2.0.8.39) and tried to add this node to the cluster with this procedure: http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_add_node_to_cluster_t.html. In OpsCenter, the node was joining the cluster and data streaming got stuck at 100%. After several hours, nodetool status showed that the node was still joining, but nothing in the logs indicated a problem. We restarted the node but it had no effect. Then we cleaned the data and commitlog contents and tried to add the node to the cluster again, without result. Our last attempt was to add the node with auto_bootstrap: false in order to add the node to the cluster manually, but that messed up the data. So we shut down the node and removed it from the cluster (with nodetool removenode). The whole cluster was then repaired and we stopped making changes.

Now our cluster has only 7 French nodes, to which we can't add any node. The OpsCenter data has disappeared and we are working without any information about how our cluster is running.

You'll find attached to this email our current configuration and a screenshot of our OpsCenter metrics page.

Do you have any idea how to clean up the mess and get our cluster running cleanly before we start our migration (from France to another country, as described at the beginning of this email)?

Thank you.

Best regards,

David CHARBONNIER
Sysadmin
T: +33 411 934 200
david.charbonn...@rgsystem.com
ZAC Aéroport
125 Impasse Adam Smith
34470 Pérols - France
www.rgsystem.com

--
Fabien Rousseau
www.yakaz.com
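A note on the topology-file step described in the message above: the new nodes landed in the original datacenter because they were started before cassandra-topology.properties had been updated. The sketch below is one possible way to render that file and to check that every new node is declared before it is started. It assumes the PropertyFileSnitch file format (ip=datacenter:rack plus a default entry); the IP addresses, the datacenter names DC_FR and DC_LU, the rack name RAC1 and both function names are hypothetical and not taken from the thread.

```python
# Sketch: render a cassandra-topology.properties file (PropertyFileSnitch format)
# and check that new nodes are declared before they are started.
# All IP addresses, datacenter names and rack names below are hypothetical.


def render_topology(nodes, default=("DC_FR", "RAC1")):
    """Return the file contents for a mapping of IP -> (datacenter, rack)."""
    lines = []
    for ip in sorted(nodes):
        dc, rack = nodes[ip]
        lines.append(f"{ip}={dc}:{rack}")
    # Nodes that are not listed explicitly fall back to this datacenter and rack,
    # which is how an undeclared node can end up in the wrong datacenter.
    lines.append(f"default={default[0]}:{default[1]}")
    return "\n".join(lines) + "\n"


def missing_from_topology(nodes, new_node_ips):
    """Return the new nodes that are not declared yet and should not be started."""
    return [ip for ip in new_node_ips if ip not in nodes]


topology = {
    "10.0.1.1": ("DC_FR", "RAC1"),
    "10.0.1.2": ("DC_FR", "RAC1"),
    "10.1.1.1": ("DC_LU", "RAC1"),
}

print(render_topology(topology))
print("Not yet declared:", missing_from_topology(topology, ["10.1.1.1", "10.1.1.2"]))
```

The rendered file would have to be distributed to every node in both datacenters before any new node is started; the check only covers the local copy of the mapping.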
Re: Problems after trying a migration
Hi David,

There is an excellent article which describes exactly what you want to do (i.e. migrate from one DC to another DC):
http://planetcassandra.org/blog/cassandra-migration-to-ec2/

--
Fabien Rousseau
aur...@yakaz.com
www.yakaz.com
Re: Problems after trying a migration
Hi David;

some input to get back to where you were:

a) Start with the French cluster only and get it working with DSE 4.5.1.
b) The OpsCenter keyspace is RF1 by default; alter the keyspace to RF3.
c) Take a full snapshot of all your nodes and copy the files to a safe location on all the nodes.

To migrate the data into the new cluster:

a) Use the same version, DSE 4.5.1, in Luxembourg and bring up 1 node at a time. Check that the node has come up in the new datacenter.
b) Bring up new nodes into the new datacenter one at a time.
c) After all your new nodes are UP in Luxembourg, run a 'nodetool repair -parallel'.
d) Check in OpsCenter that all your nodes are showing up (new and old).
e) Start taking down your nodes in France, one at a time.
f) After all the nodes in France are down, run a 'nodetool repair -parallel' again.
g) Upgrade the nodes in Luxembourg to DSE 4.6.1.
h) Run a 'nodetool repair -parallel' again.
i) Upgrade to OpsCenter 5.1.

Best of luck, hope this helps.

Jan/

On Wednesday, March 18, 2015 1:01 PM, Robert Coli rc...@eventbrite.com wrote:

> On Wed, Mar 18, 2015 at 9:05 AM, David CHARBONNIER david.charbonn...@rgsystem.com wrote:
>
>> - New nodes in the other country have been installed like French nodes except for Datastax Enterprise version (4.5.1 in France and 4.6.1 in the other country which means Cassandra version 2.0.8.39 in France and 2.0.12.200 in the other country)
>
> This is officially unsupported, and might be a cause of problems during this process.
>
> =Rob
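As an illustration of the "alter the keyspace to RF3" step above, here is a minimal sketch that builds the corresponding CQL statement as a string. It assumes the keyspace is named "OpsCenter" (quoted because of its mixed case) and that NetworkTopologyStrategy is wanted; the message does not name a replication strategy, and the datacenter names DC_FR and DC_LU and the helper name are hypothetical.

```python
# Sketch: build an ALTER KEYSPACE statement that sets a per-datacenter
# replication factor. Datacenter names are hypothetical; the choice of
# NetworkTopologyStrategy is an assumption, not something stated in the thread.


def alter_keyspace_cql(keyspace, replication_per_dc):
    """Return a CQL statement setting the replication factor for each datacenter."""
    if not replication_per_dc:
        raise ValueError("at least one datacenter is required")
    for dc, rf in replication_per_dc.items():
        if not isinstance(rf, int) or rf < 1:
            raise ValueError(f"invalid replication factor for {dc}: {rf!r}")
    options = ", ".join(
        f"'{dc}': {rf}" for dc, rf in sorted(replication_per_dc.items())
    )
    return (
        f'ALTER KEYSPACE "{keyspace}" WITH replication = '
        f"{{'class': 'NetworkTopologyStrategy', {options}}};"
    )


# French datacenter only (before the migration starts).
print(alter_keyspace_cql("OpsCenter", {"DC_FR": 3}))
# Both datacenters (while the data is being streamed).
print(alter_keyspace_cql("OpsCenter", {"DC_FR": 3, "DC_LU": 3}))
```

The statement would be run from cqlsh. A change in replication factor is normally followed by a repair of that keyspace so that the new replicas receive the existing data.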
Re: Problems after trying a migration
On Wed, Mar 18, 2015 at 12:58 PM, Robert Coli rc...@eventbrite.com wrote:

> On Wed, Mar 18, 2015 at 9:05 AM, David CHARBONNIER david.charbonn...@rgsystem.com wrote:
>
>> - New nodes in the other country have been installed like French nodes except for Datastax Enterprise version (4.5.1 in France and 4.6.1 in the other country which means Cassandra version 2.0.8.39 in France and 2.0.12.200 in the other country)
>
> This is officially unsupported, and might be a cause of problems during this process.

As regards your other situation, I suggest joining #cassandra, pointing people there towards your summary, and discussing it with them interactively. Mailing list lag is not best for operational issues. :)

=Rob
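The mixed-version point above can be checked before adding nodes. The following is a rough sketch that groups nodes by Cassandra release version and reports a mismatch. It assumes the version of each node has been collected as the text printed by nodetool version, in the form "ReleaseVersion: 2.0.8.39"; the node names and function names are hypothetical.

```python
# Sketch: detect a cluster that mixes Cassandra versions.
# Assumes one line of the form "ReleaseVersion: <version>" per node,
# collected beforehand; node names are hypothetical.


def parse_release_version(nodetool_output):
    """Extract the version string from the collected version output."""
    for line in nodetool_output.splitlines():
        if line.startswith("ReleaseVersion:"):
            return line.split(":", 1)[1].strip()
    raise ValueError("no ReleaseVersion line found")


def version_groups(versions_by_node):
    """Group node names by the version they run."""
    groups = {}
    for node, version in versions_by_node.items():
        groups.setdefault(version, []).append(node)
    return groups


versions = {
    "fr-node-1": parse_release_version("ReleaseVersion: 2.0.8.39"),
    "fr-node-2": parse_release_version("ReleaseVersion: 2.0.8.39"),
    "lu-node-1": parse_release_version("ReleaseVersion: 2.0.12.200"),
}

groups = version_groups(versions)
if len(groups) > 1:
    print("Mixed versions detected:", groups)
else:
    print("All nodes run the same version.")
```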