Re: Migrating from a windows cluster to a linux cluster.
Hi,

We were trying to do a similar kind of migration (to a new cluster, no downtime) in order to remove a legacy OrderedPartitioner limitation. In the end we were allowed enough downtime to migrate, but originally we were proposing a similar solution: deploy an update to the application so that it writes to both clusters simultaneously, and copy the older data across in the background in some way.

I'd love to hear how the migration went, and whether there were any (un)expected hurdles along the way!

Thanks,
Conan
Migrating from a windows cluster to a linux cluster.
Hey everyone,

We're trying to migrate a Cassandra cluster from a bunch of Windows machines to a bunch of (newer and more powerful) Linux machines. Our initial plan was to simply bootstrap the Linux servers into the cluster one by one, and then decommission the old servers one by one. However, when we try to join a Linux server to the cluster, we get the following error:

ERROR 11:52:22,959 Fatal exception in thread Thread[Thread-21,5,main]
java.lang.AssertionError: Filename must include parent directory.
    at org.apache.cassandra.io.sstable.Descriptor.fromFilename(Descriptor.java:148)
    at org.apache.cassandra.streaming.PendingFile$PendingFileSerializer.deserialize(PendingFile.java:138)
    at org.apache.cassandra.streaming.StreamHeader$StreamHeaderSerializer.deserialize(StreamHeader.java:88)
    at org.apache.cassandra.streaming.StreamHeader$StreamHeaderSerializer.deserialize(StreamHeader.java:70)
    at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:80)

A quick googling reveals the cause: Cassandra transmits the full path of the datafiles with the sender's native directory separator, \, while the Linux servers expect it to be /, and get confused as a result.

We're running version 1.0.8. Is this fixed in a later release? Will this be fixed in a later release? Are there any other ways of doing the migration? What happens if we join the new servers without bootstrapping and run repair? Are there any other ugly hacks or workarounds we can do? We're not looking to run a mixed cluster, we just want to migrate all the data as painlessly as possible.

/Henrik
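A minimal sketch of the separator mismatch described above, in Python rather than Cassandra's Java. The sstable path and the helper `has_parent_directory` are hypothetical and only illustrate the kind of check that fails; this is not Cassandra's actual code.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Hypothetical sstable path, as a Windows node might send it in a stream header.
WINDOWS_SSTABLE = r"C:\cassandra\data\Keyspace1\Standard1-hc-1-Data.db"


def has_parent_directory(filename: str, separator: str) -> bool:
    """Return True if the filename contains the given directory separator."""
    return filename.rfind(separator) != -1


if __name__ == "__main__":
    # Looking for the Windows separator finds the parent directory.
    print(has_parent_directory(WINDOWS_SSTABLE, "\\"))
    # Looking for the POSIX separator finds nothing, so the whole string
    # looks like a bare filename with no parent directory.
    print(has_parent_directory(WINDOWS_SSTABLE, "/"))
    # pathlib shows the same effect: parsed with POSIX rules, the whole
    # string is a single path component.
    print(PurePosixPath(WINDOWS_SSTABLE).name)
    # Parsed with Windows rules, the keyspace directory is recovered.
    print(PureWindowsPath(WINDOWS_SSTABLE).parent.name)
```

The receiving side only sees a string, so a path built with one platform's separator carries no directory structure when parsed with the other platform's rules.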
Re: Migrating from a windows cluster to a linux cluster.
Hey,

We thought a bit about it and came up with another solution: we shut down Cassandra on one of the Windows servers, copy its data directory over to one of the Linux servers, delete the LocationInfo files from the system keyspace, and start it up. It should read the saved token from the datafiles, it should have all the data associated with that token, and on joining the cluster it should just pop in at the right place, but with a new IP address. Then we repeat that for each server.

Will this work? Or is there a better way?

/Henrik
Re: Migrating from a windows cluster to a linux cluster.
On Thu, May 24, 2012 at 12:41 PM, Henrik Schröder skro...@gmail.com wrote:

> We're running version 1.0.8. Is this fixed in a later release? Will this be fixed in a later release?

No, mixed-OS clusters are unsupported.

> Are there any other ways of doing the migration? What happens if we join the new servers without bootstrapping and run repair? Are there any other ugly hacks or workaround we can do? We're not looking to run a mixed cluster, we just want to migrate all the data as painlessly as possible.

Start the Linux cluster independently and use sstableloader from the Windows cluster to populate it.

-Brandon
Re: Migrating from a windows cluster to a linux cluster.
On Thu, May 24, 2012 at 8:07 PM, Brandon Williams dri...@gmail.com wrote:

>> Are there any other ways of doing the migration? What happens if we join the new servers without bootstrapping and run repair? Are there any other ugly hacks or workaround we can do? We're not looking to run a mixed cluster, we just want to migrate all the data as painlessly as possible.
>
> Start the linux cluster independently and use sstableloader from the windows cluster to populate it.

Ok. It's important for us to not have any downtime, so how about this solution:

We start up the Linux cluster independently. We configure our application to send all Cassandra writes to both clusters, but only read from the Windows cluster. We run sstableloader on each Windows server (is it possible to do this in parallel?), sending whatever it has to the Linux cluster. When it's done on all Windows servers, we configure our application to only talk to the Linux cluster.

The only issue with this is the timestamps of the data and tombstones in each sstable: will they be preserved by sstableloader? What about deletes of non-existing keys? Will they be stored in the Linux cluster so that when sstableloader inserts the key later, it's resolved as being deleted?

/Henrik
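The dual-write plan above could be sketched roughly as follows. Everything here is hypothetical: `InMemoryCluster` stands in for a real Cassandra client, and `DualWriteClient` and `cut_over` are illustrative names, not part of any Cassandra library.

```python
class InMemoryCluster:
    """Hypothetical stand-in for a cluster client: keeps (value, timestamp) per key."""

    def __init__(self):
        self.rows = {}

    def write(self, key, value, timestamp):
        # Keep the write only if it is at least as new as what is stored.
        current = self.rows.get(key)
        if current is None or timestamp >= current[1]:
            self.rows[key] = (value, timestamp)

    def read(self, key):
        entry = self.rows.get(key)
        return None if entry is None else entry[0]


class DualWriteClient:
    """Writes to both clusters, reads from the old one until cut_over() is called."""

    def __init__(self, old_cluster, new_cluster):
        self.old = old_cluster
        self.new = new_cluster
        self.cut_over_done = False

    def write(self, key, value, timestamp):
        # The same client-supplied timestamp goes to both clusters.
        if not self.cut_over_done:
            self.old.write(key, value, timestamp)
        self.new.write(key, value, timestamp)

    def read(self, key):
        source = self.new if self.cut_over_done else self.old
        return source.read(key)

    def cut_over(self):
        # After the bulk load has finished, talk only to the new cluster.
        self.cut_over_done = True


if __name__ == "__main__":
    windows, linux = InMemoryCluster(), InMemoryCluster()
    client = DualWriteClient(windows, linux)
    client.write("user:1", "alice", timestamp=1)
    print(client.read("user:1"))  # served by the old cluster
    client.cut_over()
    print(client.read("user:1"))  # served by the new cluster
```

Sending the same client-side timestamp to both clusters matters here: it lets the new cluster order the live writes against the older data that arrives later through the bulk load.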
Re: Migrating from a windows cluster to a linux cluster.
On Thu, May 24, 2012 at 1:50 PM, Henrik Schröder skro...@gmail.com wrote:

> Ok. It's important for us to not have any downtime, so how about this solution: We start up the Linux cluster independently. We configure our application to send all Cassandra writes to both clusters, but only read from the Windows cluster. We run sstableloader on each Windows server (is it possible to do this in parallel?), sending whatever it has to the Linux cluster. When it's done on all Windows servers, we configure our application to only talk to the Linux cluster.

That sounds fine, with the caveat that you can't run sstableloader from a machine running Cassandra before 1.1, so copying the sstables manually (assuming both clusters are the same size and have the same tokens) might be better.

> The only issue with this is the timestamps of the data and tombstones in each sstable, will they be preserved by sstableloader? What about deletes of non-existing keys? Will they be stored in the Linux cluster so that when sstableloader inserts the key later, it's resolved as being deleted?

None of that should be a problem.

-Brandon
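To illustrate why the ordering question above is not a problem, here is a small sketch of timestamp-based reconciliation. It is a simplified model, not Cassandra's implementation: `Cell`, `reconcile` and `Replica` are hypothetical names, a value of `None` marks a tombstone, and the tie-break rule (tombstone wins on equal timestamps) is stated as an assumption of the model.

```python
from typing import NamedTuple, Optional


class Cell(NamedTuple):
    value: Optional[str]  # None marks a tombstone (a delete)
    timestamp: int


def reconcile(a: Cell, b: Cell) -> Cell:
    """Pick the winning cell: highest timestamp, tombstone on a tie."""
    if a.timestamp != b.timestamp:
        return a if a.timestamp > b.timestamp else b
    if a.value is None:
        return a
    if b.value is None:
        return b
    return a if a.value >= b.value else b


class Replica:
    """Hypothetical store that reconciles every incoming cell with what it holds."""

    def __init__(self):
        self.cells = {}

    def apply(self, key: str, cell: Cell) -> None:
        existing = self.cells.get(key)
        self.cells[key] = cell if existing is None else reconcile(existing, cell)

    def read(self, key: str) -> Optional[str]:
        cell = self.cells.get(key)
        return None if cell is None else cell.value


if __name__ == "__main__":
    linux = Replica()
    # A live delete reaches the new cluster before the key exists there.
    linux.apply("user:1", Cell(None, timestamp=20))
    # The bulk load later delivers the older insert with its original timestamp.
    linux.apply("user:1", Cell("alice", timestamp=10))
    print(linux.read("user:1"))  # the delete still wins
```

The result does not depend on arrival order, only on the timestamps. The sketch ignores tombstone garbage collection: in a real cluster a tombstone can be purged once gc_grace_seconds has passed, so the bulk load would presumably need to finish within that window.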
Re: Migrating from a windows cluster to a linux cluster.
It also seems like a dark deployment of your new cluster is a great method for testing the Linux-based systems *before* switching your mission-critical traffic over. Monitor them for a while with real traffic and you can have confidence that they'll function correctly when you perform the switchover.

-- Steve
Re: Migrating from a windows cluster to a linux cluster.
On Thu, May 24, 2012 at 9:28 PM, Brandon Williams dri...@gmail.com wrote:

> That sounds fine, with the caveat that you can't run sstableloader from a machine running Cassandra before 1.1, so copying the sstables manually (assuming both clusters are the same size and have the same tokens) might be better.

Why is version 1.1 required for sstableloader? We're running 1.0.x on both clusters, but we can of course upgrade if that's required.

>> The only issue with this is the timestamps of the data and tombstones in each sstable, will they be preserved by sstableloader? What about deletes of non-existing keys? Will they be stored in the Linux cluster so that when sstableloader inserts the key later, it's resolved as being deleted?
>
> None of that should be a problem.

Excellent, thanks!

/Henrik
Re: Migrating from a windows cluster to a linux cluster.
On Thu, May 24, 2012 at 3:36 PM, Henrik Schröder skro...@gmail.com wrote:

>> That sounds fine, with the caveat that you can't run sstableloader from a machine running Cassandra before 1.1, so copying the sstables manually (assuming both clusters are the same size and have the same tokens) might be better.
>
> Why is version 1.1 required for sstableloader? We're running 1.0.x on both clusters, but we can of course upgrade if that's required.

Before 1.1, sstableloader is a fat client, and thus can't coexist with an existing Cassandra instance on the same machine.

-Brandon
Re: Migrating from a windows cluster to a linux cluster.
On Thu, May 24, 2012 at 12:44 PM, Steve Neely sne...@rallydev.com wrote:

> It also seems like a dark deployment of your new cluster is a great method for testing the Linux-based systems before switching your mission-critical traffic over. Monitor them for a while with real traffic and you can have confidence that they'll function correctly when you perform the switchover.

FWIW, I would love to see graphs which compare their performance under identical write load and then show the cut-over point for reads between the two clusters. My hypothesis is that your Linux cluster will magically be much more performant/less loaded due to many Linux-specific optimizations in Cassandra, but I'd dig seeing this illustrated in an apples-to-apples sense with real app traffic.

=Rob

--
=Robert Coli