Re: Cassandra cluster migration in Amazon EC2
On Mon, Sep 2, 2013 at 4:21 PM, Renat Gilfanov gren...@mail.ru wrote:

> - Group 3 of the storage volumes into a RAID0 array, move the data directory to the RAID0 array, and put the commit log on the 4th remaining volume.
> - As far as I understand, separating the commit log from the data directory should improve performance, but what about separating the OS from those two? Is it worth doing?

Nope. Best practice on Amazon is ephemeral disks, with RAID0 for data + commit log.

> - What are the steps to perform such a migration? Will it be possible to perform it without downtime, restarting node by node with the new configuration applied? I'm especially worried about IP changes when we upgrade the instance type. What's the recommended way to handle those IP changes?

Just set auto_bootstrap: false in cassandra.yaml to change the IP address of a node. This applies to a node to which you have copied all the data its token had before its IP address changed, and which therefore does not need to be bootstrapped.

=Rob
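Editor's sketch of the auto_bootstrap change mentioned above, in Python. It only rewrites the text of a cassandra.yaml file and assumes auto_bootstrap is a top-level key that may be absent from the file. The function name set_auto_bootstrap and the sample input are hypothetical, not part of Cassandra.

```python
import re


def set_auto_bootstrap(yaml_text: str, enabled: bool) -> str:
    """Return yaml_text with the top-level auto_bootstrap key set.

    Hypothetical helper: plain text manipulation, no YAML parsing.
    """
    line = "auto_bootstrap: " + ("true" if enabled else "false")
    pattern = re.compile(r"^auto_bootstrap\s*:.*$", re.MULTILINE)
    if pattern.search(yaml_text):
        # Key already present: replace its value in place.
        return pattern.sub(line, yaml_text, count=1)
    # Key absent: append it as a new line at the end of the file.
    sep = "" if (not yaml_text or yaml_text.endswith("\n")) else "\n"
    return yaml_text + sep + line + "\n"


if __name__ == "__main__":
    # Sample content is illustrative only.
    print(set_auto_bootstrap("cluster_name: 'Test Cluster'\n", enabled=False))
```

The node would still need its data copied over beforehand, as the reply says; the sketch covers only the configuration line.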
Cassandra cluster migration in Amazon EC2
Hello,

Currently we have a Cassandra cluster in Amazon EC2, and we are planning to upgrade our deployment configuration to achieve better performance and stability. However, a lot of open questions arise when planning this migration. I'll be very thankful if somebody could answer my questions.

Current state: We use Apache Cassandra 1.2.8 on 5 nodes deployed in Amazon EC2, on m1.large instances with an EBS volume each. In Cassandra we have set up 2 datacenters: the first one has 3 nodes, each in a separate rack, and the second has 2 nodes in a separate rack. However, all Amazon instances belong to the same region and even the same availability zone. The replication factor for our keyspace is the following: {'class': 'NetworkTopologyStrategy', 'DC2': '1', 'DC1': '2'}. We have virtual nodes enabled, but the shuffle hasn't been completed yet, and the nodes are unbalanced.

What we want to achieve:
- We would like to move to M1 Extra Large instances with 4x420 GB instance storage volumes.
- Group 3 of the storage volumes into a RAID0 array, move the data directory to the RAID0 array, and put the commit log on the 4th remaining volume.

Open questions:
- Does the suggested configuration look reasonable from the performance optimization point of view?
- As far as I understand, separating the commit log from the data directory should improve performance, but what about separating the OS from those two? Is it worth doing?
- What are the steps to perform such a migration? Will it be possible to perform it without downtime, restarting node by node with the new configuration applied? I'm especially worried about IP changes when we upgrade the instance type. What's the recommended way to handle those IP changes?

Best Regards,
Renat.
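Editor's sketch of the disk arithmetic behind the two layouts discussed in this thread: the proposed 3-volume RAID0 plus a separate commit log volume, and a single RAID0 across all four volumes holding data and commit log. It assumes RAID0 capacity is simply the sum of its members and ignores filesystem overhead. The function name is hypothetical.

```python
DISK_GB = 420    # size of one instance storage volume, from the message above
DISK_COUNT = 4   # volumes per M1 Extra Large instance, from the message above


def raid0_capacity_gb(disk_count: int, disk_gb: int = DISK_GB) -> int:
    """Raw RAID0 capacity: striping adds capacity and keeps no redundancy."""
    return disk_count * disk_gb


# Proposed layout: 3 volumes striped for data, 1 volume for the commit log.
proposed_data_gb = raid0_capacity_gb(3)
proposed_commitlog_gb = raid0_capacity_gb(1)

# Alternative layout: all 4 volumes striped, shared by data and commit log.
shared_gb = raid0_capacity_gb(DISK_COUNT)

if __name__ == "__main__":
    print("proposed data array (GB):", proposed_data_gb)
    print("proposed commit log volume (GB):", proposed_commitlog_gb)
    print("single 4-volume array (GB):", shared_gb)
```

Under these assumptions the proposed layout gives 1260 GB for data and 420 GB for the commit log, while the single array gives 1680 GB shared.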
Re: Cassandra cluster migration in Amazon EC2
If you launch the new servers, have them join the cluster, then decommission the old ones, you'll be able to do it without downtime. It'll also have the effect of randomizing the tokens, I believe.
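Editor's sketch of the order of operations described above: bring up every new node first, then decommission the old nodes one at a time. The code only builds a checklist of strings and runs nothing. The host names are hypothetical, and the nodetool invocations are written as comments for an operator to run by hand.

```python
from typing import List


def replacement_plan(old_nodes: List[str], new_nodes: List[str]) -> List[str]:
    """Build an ordered checklist: join all new nodes, then retire old ones."""
    steps = []
    for host in new_nodes:
        steps.append(f"start Cassandra on {host} and let it join the cluster")
        steps.append(f"nodetool -h {host} status  # wait until {host} is listed as UN")
    for host in old_nodes:
        # Decommission one node at a time; the node streams its data to the
        # remaining nodes before leaving the ring.
        steps.append(f"nodetool -h {host} decommission")
    return steps


if __name__ == "__main__":
    # Hypothetical host names, for illustration only.
    plan = replacement_plan(["old-1", "old-2"], ["new-1", "new-2"])
    for number, step in enumerate(plan, start=1):
        print(f"{number}. {step}")
```

Keeping the join steps ahead of every decommission step means the cluster never drops below its original node count during the move.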
Re[2]: Cassandra cluster migration in Amazon EC2
Thanks for the quick reply! If I launch a new Cassandra node, should I first add its IP to cassandra-topology.properties and to the seeds parameter in cassandra.yaml on all existing nodes, and then restart them?

--
Renat Gilfanov
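Editor's sketch of what entries in cassandra-topology.properties look like, since the question above concerns adding a new node's IP to that file. It renders lines in the IP=DC:RACK form used by that file. The IP addresses and rack names are hypothetical; only the datacenter names DC1 and DC2 come from the thread. It does not address whether a restart is needed.

```python
from typing import Dict, List, Optional, Tuple


def topology_lines(
    nodes: Dict[str, Tuple[str, str]],
    default: Optional[Tuple[str, str]] = None,
) -> List[str]:
    """Render one 'IP=DC:RACK' line per node, plus an optional default line."""
    lines = [f"{ip}={dc}:{rack}" for ip, (dc, rack) in sorted(nodes.items())]
    if default is not None:
        lines.append(f"default={default[0]}:{default[1]}")
    return lines


if __name__ == "__main__":
    # Hypothetical addresses and racks, for illustration only.
    nodes = {
        "10.0.0.11": ("DC1", "RAC1"),
        "10.0.0.12": ("DC1", "RAC2"),
        "10.0.0.21": ("DC2", "RAC1"),
    }
    print("\n".join(topology_lines(nodes, default=("DC1", "RAC1"))))
```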