More Migration Questions

Shane McEwan Thu, 08 Nov 2012 13:10:17 -0800

G'day!

Just to add to the list of people asking questions about migrating to 1.2.1 . . .

We're about to migrate our 4 node production Riak database from 1.1.1 to 1.2.1. At the same time we're also migrating from virtual machines to physical machines. These machines will have new names and IP addresses.

The process of doing rolling upgrades is well documented but I'm unsure of the correct procedure for moving to an entirely new cluster.

We have the luxury of a maintenance window so we don't need to keep everything running during the migration. Therefore the current plan is to stop the current cluster, copy the Riak data directories to the new machines and start up the new cluster. The hazy part of the process is how we "reip" the database so it will work in the new cluster.

We've tried using the "riak-admin reip" command but were left with one of our nodes in "(legacy)" mode according to "riak-admin member-status". From an earlier E-Mail thread [1] it seems like "reip" is deprecated and we should be doing a "cluster force replace" instead.


So, would the new procedure be the following?

1. Shut down old cluster
2. Copy data directory
3. Start new cluster
   (QUESTION: The new nodes don't own any of the partitions in the data directory. What does it do?)
   (QUESTION: The new nodes won't be part of a cluster yet. Do I need to "join" them before I can do any of the following commands? Or do I just put all the joins and force-replace commands into the same plan and commit it all together?)
4. Issue "riak-admin cluster force-replace old-node1 new-node1"
   (QUESTION: Do I run this command just on "new-node1" or on all nodes?)
5. Issue "force-replace" commands for the remaining three nodes.
6. Issue a "cluster plan" and "cluster commit" to commit the changes.
7. Cross fingers.
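
To make the proposed sequence concrete, here is a dry-run sketch of the cluster commands from the steps above. It only prints the commands and does not run them. The node names are the placeholders used in this post (real node names would be whatever each node's configuration uses), and the `print_plan` helper is hypothetical. Whether "join" commands also need to be staged first is exactly the open question above, so none are included.

```shell
#!/bin/sh
# Dry run only: prints the riak-admin commands from the proposed procedure
# instead of executing them. Node names are placeholders, not real hosts.

OLD_NODES="old-node1 old-node2 old-node3 old-node4"
NEW_NODES="new-node1 new-node2 new-node3 new-node4"

# print_plan OLD_LIST NEW_LIST
# Pairs the Nth old node with the Nth new node and prints one
# force-replace per pair, followed by the plan and commit commands.
print_plan() {
    old_nodes=$1
    new_nodes=$2
    # Load the new node names into the positional parameters.
    set -- $new_nodes
    for old in $old_nodes; do
        echo "riak-admin cluster force-replace $old $1"
        shift
    done
    echo "riak-admin cluster plan"
    echo "riak-admin cluster commit"
}

print_plan "$OLD_NODES" "$NEW_NODES"
```

Printing the commands first gives something to review (and to paste into a test run) before anything is staged on a real node.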

In my mind the "replace" and/or "force-replace" commands are something we would use if we had a failed node and needed to bring a spare online to take over. It doesn't feel like something you would do if you don't already have a cluster in place and are needing to "replace" ALL nodes.

Of course, we want to test this procedure before doing it for real. What are the risks of doing the above procedure while the old cluster is still running? While the new nodes are on a segregated network and shouldn't be able to contact the old nodes, what would happen if we did the above and found the network wasn't as segregated as we originally thought? Would the new nodes start trying to communicate with the old nodes before the "force-replace" can take effect? Or, because all the cluster changes are atomic, there won't be any risk of that?

Sorry for all the questions. I'm just trying to get a clear procedure for moving an entire cluster to new hardware and hopefully this thread will help other people in the future.


Thanks in advance!

Shane.

[1] http://comments.gmane.org/gmane.comp.db.riak.user/8418


