Re: Adding nodes to existing cluster
Start one node at a time, waiting 2 minutes before starting each node. How much data and how many nodes do you already have? Depending on that, streaming the data can put stress on the resources you have. I would recommend starting one node and monitoring it; if things are OK, add another one, and so on.

Regards,
Carlos Juzarte Rolo
Cassandra Consultant
Pythian - Love your data
rolo@pythian | Twitter: cjrolo | Linkedin: http://linkedin.com/in/carlosjuzarterolo
Mobile: +31 6 159 61 814 | Tel: +1 613 565 8696 x1649
www.pythian.com

On Mon, Apr 20, 2015 at 11:02 AM, Or Sher or.sh...@gmail.com wrote:

Hi all,

In the near future I'll need to add more than 10 nodes to a 2.0.9 cluster (using vnodes). I read this documentation on the DataStax website:
http://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_add_node_to_cluster_t.html

At one point it says: "If you are using racks, you can safely bootstrap two nodes at a time when both nodes are on the same rack."

And at another it says: "Start Cassandra on each new node. Allow two minutes between node initializations. You can monitor the startup and data streaming process using nodetool netstats."

We're not using a rack configuration, and from reading this documentation I'm not sure whether it is safe for us to bootstrap all nodes together (with two minutes between each). I really hate the thought of doing it one by one; I assume it will take more than 6 hours per node. What do you say?

--
Or Sher
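As a rough illustration of the "start one, monitor, then add another" approach, here is a minimal Python sketch. It is not an official procedure. The `start_node` callback is hypothetical (it stands for however you start Cassandra on a host), the host names are placeholders, and the check assumes `nodetool netstats` prints a `Mode: NORMAL` line once the node has finished joining. It only waits for each node to join; watching network, CPU, and I/O is still up to you.

```python
import subprocess
import time
from typing import Callable, Iterable, List


def run_nodetool(host: str, command: str) -> str:
    """Run `nodetool -h <host> <command>` and return its standard output."""
    result = subprocess.run(
        ["nodetool", "-h", host, command],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def has_joined(host: str, nodetool: Callable[[str, str], str]) -> bool:
    """Return True once `nodetool netstats` reports the node in NORMAL mode."""
    try:
        return "Mode: NORMAL" in nodetool(host, "netstats")
    except subprocess.CalledProcessError:
        # nodetool could not reach the node yet (for example JMX is not up).
        return False


def add_nodes_one_by_one(
    new_hosts: Iterable[str],
    start_node: Callable[[str], None],  # hypothetical: starts Cassandra on a host
    nodetool: Callable[[str, str], str] = run_nodetool,
    sleep: Callable[[float], None] = time.sleep,
    stagger_seconds: float = 120,  # the "two minutes" from the documentation
    poll_seconds: float = 60,      # assumed polling interval
) -> List[str]:
    """Start each new node only after the previous one has finished joining."""
    joined: List[str] = []
    for host in new_hosts:
        start_node(host)
        sleep(stagger_seconds)
        while not has_joined(host, nodetool):
            sleep(poll_seconds)
        joined.append(host)
    return joined
```

The `nodetool` and `sleep` parameters are injectable so the loop can be exercised without a real cluster.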
Re: Adding nodes to existing cluster
unsubscribe
RE: Adding nodes to existing cluster
Hi Colin,

To remove your address from the list, send a message to: user-unsubscr...@cassandra.apache.org

Cheers,
Matt

From: Colin Clark [mailto:co...@clark.ws]
Sent: 20 April 2015 14:10
To: user@cassandra.apache.org
Subject: Re: Adding nodes to existing cluster

unsubscribe
Re: Adding nodes to existing cluster
Thanks for the response. Sure, we'll monitor as we add nodes. We currently have 6 nodes in each DC (we have 2 DCs), and each node holds ~800GB.

Do you know how rack configurations are relevant here? Do you see any reason to bootstrap the nodes one by one if we're not using rack awareness?

--
Or Sher
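To put the numbers in this message in perspective, here is a back-of-the-envelope Python sketch of how much data each new node might receive when nodes are added one at a time. It rests on assumptions that are not in the thread: data is spread evenly across the nodes of a DC (roughly what vnodes aim for), the replication factor stays the same, and 5 new nodes go into each DC. Treat the results as rough estimates only.

```python
from typing import List


def streamed_per_new_node(
    existing_nodes: int, gb_per_node: float, new_nodes: int
) -> List[float]:
    """Estimate the GB streamed to each node added sequentially to one DC.

    Assumes the total stored in the DC stays constant and is evenly balanced,
    so the k-th new node receives total / (existing_nodes + k).
    """
    total_gb = existing_nodes * gb_per_node
    return [total_gb / (existing_nodes + k) for k in range(1, new_nodes + 1)]


# 6 nodes per DC at ~800GB each (from the thread); 5 new nodes is an assumption.
estimates = streamed_per_new_node(existing_nodes=6, gb_per_node=800, new_nodes=5)
for position, gigabytes in enumerate(estimates, start=1):
    print(f"new node {position}: ~{gigabytes:.0f} GB")
```

Under these assumptions the first new node in a DC would receive roughly 686 GB and the fifth roughly 436 GB, so later bootstraps should stream somewhat less than the first.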
Re: Adding nodes to existing cluster
Independent of the snitch, the data needs to travel to the new nodes (plus all the keyspace information that goes via gossip). So I wouldn't bootstrap them all at once, even if only because of the network traffic generated.

Don't forget to run cleanup on the old nodes once all the new nodes are in place, to reclaim disk space.

Regards,
Carlos Juzarte Rolo
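The cleanup step mentioned in this reply could be scripted along these lines. This is only a sketch: the host names are placeholders, and it assumes `nodetool -h <host> cleanup` can reach each old node and returns once cleanup on that node has finished. Running the nodes one after another is a cautious choice made for this example, since cleanup rewrites data files and adds I/O load.

```python
import subprocess
from typing import Callable, Iterable, List


def run_cleanup(host: str) -> None:
    """Run `nodetool cleanup` against one node; raises if nodetool fails."""
    subprocess.run(["nodetool", "-h", host, "cleanup"], check=True)


def cleanup_old_nodes(
    old_hosts: Iterable[str],
    cleanup: Callable[[str], None] = run_cleanup,
) -> List[str]:
    """Clean up the pre-existing nodes one after another.

    Intended to run only after every new node has finished joining.
    Returns the hosts that were cleaned, in order.
    """
    cleaned: List[str] = []
    for host in old_hosts:
        cleanup(host)
        cleaned.append(host)
    return cleaned
```

A call such as `cleanup_old_nodes(["old-node-1", "old-node-2"])` would process the two placeholder hosts in sequence and stop at the first failure.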
Re: Adding nodes to existing cluster
The documentation is referring to Consistent Range Movements. There is a change in 2.1 that won't allow you to bootstrap multiple nodes at the same time unless you explicitly turn off consistent range movements. Check out the jira: https://issues.apache.org/jira/browse/CASSANDRA-2434

All the best,
Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com
http://www.datastax.com/
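For the consistent range movements setting mentioned in this reply, here is a small Python sketch of how a start command might be assembled. The thread does not name the setting; to my knowledge it is the JVM system property `cassandra.consistent.rangemovement`, so please verify that against the documentation for your version. The `cassandra` launcher path is a placeholder, and the setting applies to 2.1 and later, not to the 2.0.9 cluster discussed here.

```python
from typing import List


def cassandra_start_command(
    consistent_range_movement: bool = True,
    cassandra_bin: str = "cassandra",  # placeholder path to the launcher script
) -> List[str]:
    """Build the argument list for starting a bootstrapping node.

    Leaving consistent range movement on is the default; turning it off is
    what allows several nodes to bootstrap at once, at the cost of the
    consistency guarantee described in CASSANDRA-2434.
    """
    command = [cassandra_bin]
    if not consistent_range_movement:
        # Assumed property name; check it for your Cassandra version.
        command.append("-Dcassandra.consistent.rangemovement=false")
    return command
```

On package installs the same property is usually added to the JVM options file rather than passed on the command line.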
Re: Adding nodes to existing cluster
OK, thanks. I'll monitor resource status (network, memory, CPU, I/O) as I go and try to bootstrap the nodes in chunks that don't seem to have a bad impact. Will do regarding the cleanup.

Thanks!

--
Or Sher