Re: Cassandra cluster migration in Amazon EC2

2013-09-03 Thread Robert Coli
On Mon, Sep 2, 2013 at 4:21 PM, Renat Gilfanov gren...@mail.ru wrote:

 - Group 3 of the volumes into a RAID0 array, move the data directory to the
 array, and put the commit log on the remaining 4th volume.
  - As far as I understand, separating the commit log from the data directory
 should improve performance - but what about separating the OS from
 those two - is it worth doing?


Nope. Best practice on Amazon is ephemeral disks, with RAID0 for data +
commit log.
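
To put rough numbers on the two layouts for the 4x420 GB instance storage from the original question, here is a small Python sketch. It assumes RAID0 capacity is simply the sum of its member volumes and reads the advice above as one array across all four volumes; the function name is made up for illustration.

```python
def raid0_capacity_gb(volumes, size_gb):
    """Usable capacity of a RAID0 array: striping adds no redundancy overhead."""
    return volumes * size_gb

# Layout from the question: 3 volumes striped for data, 1 left for the commit log.
split_layout = raid0_capacity_gb(3, 420)

# Layout suggested in the reply, assuming all 4 volumes share one array.
single_array = raid0_capacity_gb(4, 420)

print(split_layout, single_array)
```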


  - What are the steps to perform such migration? Will it be possible to
 perform it without downtime, restarting node by node with new configuration
 applied?
  I'm especially worried about IP changes when we upgrade the instance
 type. What's the recommended way to handle those IP changes?


Just set auto_bootstrap: false in cassandra.yaml. That lets you change the IP
address of a node to which you have copied all the data its tokens held before
the IP address changed; such a node does not need to be bootstrapped.
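
As a rough illustration of that change, the Python sketch below rewrites top-level settings in the text of a cassandra.yaml. It is only a sketch: `apply_overrides` is a hypothetical helper, the addresses are placeholders, and updating `listen_address` alongside `auto_bootstrap` is an assumption about what else changes with the IP, not something stated above. It handles only simple uncommented `key: value` lines.

```python
import re

def apply_overrides(yaml_text, overrides):
    """Replace or append simple top-level `key: value` settings."""
    remaining = dict(overrides)
    out = []
    for line in yaml_text.splitlines():
        match = re.match(r"^([A-Za-z_][A-Za-z0-9_]*):", line)
        if match and match.group(1) in remaining:
            key = match.group(1)
            out.append(f"{key}: {remaining.pop(key)}")
        else:
            out.append(line)
    # Settings that were not present (auto_bootstrap may not be) get appended.
    out.extend(f"{key}: {value}" for key, value in remaining.items())
    return "\n".join(out) + "\n"

# Placeholder config text and addresses, for illustration only.
old_config = "cluster_name: 'Test Cluster'\nlisten_address: 10.0.0.5\n"
new_config = apply_overrides(
    old_config,
    {"auto_bootstrap": "false", "listen_address": "10.0.0.12"},
)
print(new_config)
```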

=Rob


Cassandra cluster migration in Amazon EC2

2013-09-02 Thread Renat Gilfanov
 Hello,

Currently we have a Cassandra cluster in the Amazon EC2, and we are planning to 
upgrade our deployment configuration to achieve better 
performance and stability. However, a lot of open questions arise when planning 
this migration. I'd be very thankful if somebody could answer my 
questions.

Current state:

We use Apache Cassandra 1.2.8 on 5 nodes deployed in Amazon EC2, on 
m1.large instances with an EBS volume each. In Cassandra we have set up 2 
datacenters: the first has 3 nodes, each in a separate rack; the second has 
2 nodes in a separate rack. However, all Amazon instances belong to the 
same region and even the same availability zone. The replication factor for 
our keyspace is the following: {'class': 'NetworkTopologyStrategy',  'DC2': '1',  
'DC1': '2'}.
We have virtual nodes enabled, however the shuffle hasn't been completed yet, 
and the nodes are unbalanced.

What we want to achieve:

- We would like to move to M1 Extra Large instances with 4x420 GB of instance 
storage. 
- Group 3 of the volumes into a RAID0 array, move the data directory to the 
array, and put the commit log on the remaining 4th volume.

Open questions:
 - Does the suggested configuration look reasonable from the performance 
optimization point of view?
 - As far as I understand, separating the commit log from the data directory 
should improve performance - but what about separating the OS from those two - 
is it worth doing?
 - What are the steps to perform such a migration? Will it be possible to 
perform it without downtime, restarting node by node with the new configuration 
applied?
 I'm especially worried about IP changes when we upgrade the instance type. 
What's the recommended way to handle those IP changes?

Best Regards,
Renat.

Re: Cassandra cluster migration in Amazon EC2

2013-09-02 Thread Jon Haddad
If you launch the new servers, have them join the cluster, then decommission 
the old ones, you'll be able to do it without downtime.  It'll also have the 
effect of randomizing the tokens, I believe. 
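
The ordering described above can be sketched in Python as a simple plan builder. The host names are hypothetical, and bringing nodes in and out one at a time is an assumption of this sketch rather than something stated in the thread; `nodetool status` and `nodetool decommission` are the commands one would typically run at the check and decommission steps.

```python
def replacement_plan(old_nodes, new_nodes):
    """Order the steps: every new node joins before any old node leaves."""
    plan = []
    for host in new_nodes:
        plan.append(("start_and_join", host))        # start Cassandra, let it bootstrap
        plan.append(("wait_until_up_normal", host))  # e.g. check `nodetool status`
    for host in old_nodes:
        plan.append(("decommission", host))          # run `nodetool decommission` on it
    return plan

# Hypothetical host names, for illustration only.
for step, host in replacement_plan(["old-1", "old-2"], ["new-1", "new-2"]):
    print(f"{step:22} {host}")
```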




Re[2]: Cassandra cluster migration in Amazon EC2

2013-09-02 Thread Renat Gilfanov
 Thanks for the quick reply!

If I launch the new Cassandra node, should I first add its IP to 
cassandra-topology.properties and to the seeds parameter in cassandra.yaml on 
all existing nodes and restart them?


-- 
Renat Gilfanov