Re: Migrating from a windows cluster to a linux cluster.

2012-05-29 Thread Conan Cook
Hi,

We were trying to do a similar kind of migration (to a new cluster, no
downtime) in order to remove a legacy OrderedPartitioner limitation.  In
the end we were allowed enough downtime to migrate, but originally we were
proposing a similar solution: deploy an update to the application that
writes to both clusters simultaneously, and copy the older data across in
the background.

I'd love to hear how the migration went, and whether there were any
(un)expected hurdles along the way!

Thanks,


Conan




Migrating from a windows cluster to a linux cluster.

2012-05-24 Thread Henrik Schröder
Hey everyone,

We're trying to migrate a Cassandra cluster from a bunch of Windows
machines to a bunch of (newer and more powerful) Linux machines.

Our initial plan was to simply bootstrap the Linux servers into the cluster
one by one, and then decommission the old servers one by one. However, when
we try to join a Linux server to the cluster, we get the following error:

ERROR 11:52:22,959 Fatal exception in thread Thread[Thread-21,5,main]
java.lang.AssertionError: Filename must include parent directory.
at
org.apache.cassandra.io.sstable.Descriptor.fromFilename(Descriptor.java:148)
at
org.apache.cassandra.streaming.PendingFile$PendingFileSerializer.deserialize(PendingFile.java:138)
at
org.apache.cassandra.streaming.StreamHeader$StreamHeaderSerializer.deserialize(StreamHeader.java:88)
at
org.apache.cassandra.streaming.StreamHeader$StreamHeaderSerializer.deserialize(StreamHeader.java:70)
at
org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:80)

A quick googling reveals that the cause is the simple fact that Cassandra
is transmitting the full path of the datafiles with the native directory
separator, \, and the Linux servers expect it to be /, and get confused
as a result.
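
To illustrate the separator problem, here is a minimal Python sketch (not Cassandra's actual Java code). The sstable path is made up for the example. It shows that a Windows-style path, when parsed with POSIX rules, is a single file name with no parent directory, which is consistent with the assertion message above.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Hypothetical sstable path, written the way a Windows node would name it.
windows_style = r"C:\cassandra\data\Keyspace1\Standard1-hc-1-Data.db"

# Parsed with Windows rules, the backslashes separate directories.
as_windows = PureWindowsPath(windows_style)
print(as_windows.parent)
print(as_windows.name)

# Parsed with POSIX rules, a backslash is an ordinary filename character,
# so the whole string is one file name and the parent is just ".".
as_posix = PurePosixPath(windows_style)
has_parent = as_posix.parent != PurePosixPath(".")
print(has_parent)
```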

We're running version 1.0.8. Is this fixed in a later release? Will this be
fixed in a later release?

Are there any other ways of doing the migration? What happens if we join
the new servers without bootstrapping and run repair? Are there any other
ugly hacks or workarounds we can use? We're not looking to run a mixed
cluster, we just want to migrate all the data as painlessly as possible.


/Henrik


Re: Migrating from a windows cluster to a linux cluster.

2012-05-24 Thread Henrik Schröder
Hey, we thought a bit about it and came up with another solution:

We shut down Cassandra on one of the Windows servers, copy its data
directory over to one of the Linux servers, delete the LocationInfo files
from the system keyspace, and start it up.

It should read the saved token from the datafiles and have all the data
associated with that token, so on joining the cluster it should just pop in
at the right place, but with a new IP address. And then we repeat that for
each server.

Will this work? Or is there a better way?


/Henrik




Re: Migrating from a windows cluster to a linux cluster.

2012-05-24 Thread Brandon Williams
On Thu, May 24, 2012 at 12:41 PM, Henrik Schröder skro...@gmail.com wrote:
 We're running version 1.0.8. Is this fixed in a later release? Will this be
 fixed in a later release?

No, mixed-OS clusters are unsupported.

 Are there any other ways of doing the migration? What happens if we join the
 new servers without bootstrapping and run repair? Are there any other ugly
 hacks or workarounds we can use? We're not looking to run a mixed cluster, we
 just want to migrate all the data as painlessly as possible.

Start the linux cluster independently and use sstableloader from the
windows cluster to populate it.

-Brandon


Re: Migrating from a windows cluster to a linux cluster.

2012-05-24 Thread Henrik Schröder
On Thu, May 24, 2012 at 8:07 PM, Brandon Williams dri...@gmail.com wrote:

  Are there any other ways of doing the migration? What happens if we join
  the new servers without bootstrapping and run repair? Are there any other
  ugly hacks or workarounds we can use? We're not looking to run a mixed
  cluster, we just want to migrate all the data as painlessly as possible.

 Start the linux cluster independently and use sstableloader from the
 windows cluster to populate it.


Ok. It's important for us to not have any downtime, so how about this
solution:

We start up the Linux cluster independently.
We configure our application to send all Cassandra writes to both clusters,
but only read from the Windows cluster.
We run sstableloader on each Windows server (is it possible to do this in
parallel?), sending whatever it has to the Linux cluster.
When it's done on all Windows servers, we configure our application to only
talk to the Linux cluster.
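
A minimal sketch of the application-side part of this plan, in Python. Everything here is hypothetical: the class name is invented, and plain dicts stand in for the two clusters where a real application would use its Cassandra client. The one design point it tries to show is that the same client-supplied timestamp is sent to both clusters.

```python
class DualWriteClient:
    """Writes go to both clusters until cut-over; reads use one cluster."""

    def __init__(self, old_cluster, new_cluster):
        self.old = old_cluster      # stand-in for the Windows cluster
        self.new = new_cluster      # stand-in for the Linux cluster
        self.cut_over_done = False

    def write(self, key, value, timestamp):
        # Same timestamp on both sides, so both clusters order writes alike.
        self.new[key] = (value, timestamp)
        if not self.cut_over_done:
            self.old[key] = (value, timestamp)

    def read(self, key):
        source = self.new if self.cut_over_done else self.old
        return source.get(key)

    def cut_over(self):
        # Called once the bulk load of older data has finished everywhere.
        self.cut_over_done = True


windows = {"a": ("old", 1)}
linux = {}
client = DualWriteClient(windows, linux)
client.write("b", "new", 2)           # lands in both stand-ins
before = client.read("a")             # served by the Windows stand-in
linux.setdefault("a", ("old", 1))     # crude stand-in for the bulk load
client.cut_over()
after = client.read("a")              # now served by the Linux stand-in
client.write("c", "late", 3)          # only reaches the Linux stand-in
```

The `setdefault` call only imitates loading older data; it does not model how Cassandra merges loaded sstables with live writes.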

The only issue with this is the timestamps of the data and tombstones in
each sstable: will they be preserved by sstableloader? What about deletes
of non-existing keys? Will they be stored in the Linux cluster so that when
sstableloader inserts the key later, it's resolved as being deleted?


/Henrik


Re: Migrating from a windows cluster to a linux cluster.

2012-05-24 Thread Brandon Williams
On Thu, May 24, 2012 at 1:50 PM, Henrik Schröder skro...@gmail.com wrote:
 Ok. It's important for us to not have any downtime, so how about this
 solution:

 We start up the Linux cluster independently.
 We configure our application to send all Cassandra writes to both clusters,
 but only read from the Windows cluster.
 We run sstableloader on each Windows server (is it possible to do this in
 parallel?), sending whatever it has to the Linux cluster.
 When it's done on all Windows servers, we configure our application to only
 talk to the Linux cluster.

That sounds fine, with the caveat that you can't run sstableloader
from a machine running Cassandra before 1.1, so copying the sstables
manually (assuming both clusters are the same size and have the same
tokens) might be better.
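
As a rough illustration of the "same size, same tokens" precondition for a manual copy, here is a small Python sketch. The node names and tokens are invented, and the function is a hypothetical helper, not part of any Cassandra tool. It pairs each old node with the new node that owns the same token, and refuses to continue if the two rings do not match.

```python
def pair_nodes_by_token(old_ring, new_ring):
    """Map each old node to the new node that owns the same token."""
    if sorted(old_ring.values()) != sorted(new_ring.values()):
        raise ValueError("clusters must have the same size and the same tokens")
    new_by_token = {token: node for node, token in new_ring.items()}
    return {node: new_by_token[token] for node, token in old_ring.items()}


# Illustrative tokens only: three evenly spaced values in a 2**127 range.
tokens = [i * (2**127 // 3) for i in range(3)]
old_ring = dict(zip(["win1", "win2", "win3"], tokens))
new_ring = dict(zip(["lin3", "lin1", "lin2"], [tokens[2], tokens[0], tokens[1]]))

pairs = pair_nodes_by_token(old_ring, new_ring)
for old_node, new_node in sorted(pairs.items()):
    print(f"copy sstables from {old_node} to {new_node}")
```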

 The only issue with this is the timestamps of the data and tombstones in
 each sstable, will they be preserved by sstableloader? What about deletes of
 non-existing keys? Will they be stored in the Linux cluster so that when
 sstableloader inserts the key later, it's resolved as being deleted?

None of that should be a problem.
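
A simplified Python sketch of why this works out, assuming timestamp-based reconciliation in which the newest write wins and a tombstone is just a write without a value. The `Cell` type and `reconcile` function are invented for illustration, and the tie-breaking rule is a simplification, not Cassandra's exact logic.

```python
from typing import NamedTuple, Optional


class Cell(NamedTuple):
    value: Optional[str]  # None marks a tombstone (a delete)
    timestamp: int


def reconcile(a: Cell, b: Cell) -> Cell:
    """Keep the cell with the higher timestamp; on a tie prefer a tombstone."""
    if a.timestamp != b.timestamp:
        return a if a.timestamp > b.timestamp else b
    return a if a.value is None else b


# A delete arrives on the new cluster through the live dual writes first...
live_delete = Cell(None, 200)
# ...and the older insert for the same key is bulk-loaded afterwards.
loaded_insert = Cell("v1", 100)

result = reconcile(live_delete, loaded_insert)
print(result.value is None)  # the key still resolves as deleted
```

Because the outcome depends only on the timestamps, the order in which the delete and the older insert reach the new cluster does not matter in this model.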

-Brandon


Re: Migrating from a windows cluster to a linux cluster.

2012-05-24 Thread Steve Neely
It also seems like a dark deployment of your new cluster is a great method
for testing the Linux-based systems *before* switching your mission-critical
traffic over. Monitor them for a while with real traffic and you can have
confidence that they'll function correctly when you perform the switchover.

-- Steve





Re: Migrating from a windows cluster to a linux cluster.

2012-05-24 Thread Henrik Schröder
On Thu, May 24, 2012 at 9:28 PM, Brandon Williams dri...@gmail.com wrote:


 That sounds fine, with the caveat that you can't run sstableloader
 from a machine running Cassandra before 1.1, so copying the sstables
 manually (assuming both clusters are the same size and have the same
 tokens) might be better.


Why is version 1.1 required for sstableloader? We're running 1.0.x on both
clusters, but we can of course upgrade if that's required.


  The only issue with this is the timestamps of the data and tombstones in
  each sstable, will they be preserved by sstableloader? What about deletes
  of non-existing keys? Will they be stored in the Linux cluster so that when
  sstableloader inserts the key later, it's resolved as being deleted?

 None of that should be a problem.


Excellent, thanks!


/Henrik


Re: Migrating from a windows cluster to a linux cluster.

2012-05-24 Thread Brandon Williams
On Thu, May 24, 2012 at 3:36 PM, Henrik Schröder skro...@gmail.com wrote:
 That sounds fine, with the caveat that you can't run sstableloader
 from a machine running Cassandra before 1.1, so copying the sstables
 manually (assuming both clusters are the same size and have the same
 tokens) might be better.


 Why is version 1.1 required for sstableloader? We're running 1.0.x on both
 clusters, but we can of course upgrade if that's required.

Before 1.1, sstableloader is a fat client, and thus can't coexist with
an existing Cassandra instance on the same machine.

-Brandon


Re: Migrating from a windows cluster to a linux cluster.

2012-05-24 Thread Rob Coli
On Thu, May 24, 2012 at 12:44 PM, Steve Neely sne...@rallydev.com wrote:
 It also seems like a dark deployment of your new cluster is a great method
 for testing the Linux-based systems before switching your mission-critical
 traffic over. Monitor them for a while with real traffic and you can have
 confidence that they'll function correctly when you perform the switchover.

FWIW, I would love to see graphs which compare their performance
under identical write load and then show the cut-over point for reads
between the two clusters. My hypothesis is that your Linux cluster
will magically be much more performant/less loaded due to many
Linux-specific optimizations in Cassandra, but I'd dig seeing this
illustrated in an apples-to-apples sense with real app traffic.

=Rob

-- 
=Robert Coli
AIM&GTALK - rc...@palominodb.com
YAHOO - rcoli.palominob
SKYPE - rcoli_palominodb