Thanks, Jonathan!

Part of what we're trying to accomplish is a data cleanup. One of our nodes seems to have some lingering data from an old column family that we no longer have defined (we're running v0.60), so that node has a few GB of data that never gets replicated. We're hoping that by bringing that node offline, we could flush out that old data so our nodes appear more balanced in disk load.

We're also considering just moving our three 'medium' (32-bit) EC2 instances to a single extra-large (64-bit) instance to do what you've suggested, but that would mean moving from a 32-bit platform to a 64-bit platform. Is Cassandra 0.60 going to have problems if we migrate data to a single 64-bit system and then back to several 32-bit systems? (We've looked at replicating our PostgreSQL database, but the binary data files are not compatible between 32-bit and 64-bit systems.)

Thanks,
Ian


On 03/18/2011 11:56 AM, Jonathan Ellis wrote:
That should work, but if you have the disk space it's a lot simpler to
just copy all the data files from each machine to a target out of the
cluster, then have the target run cleanup.

On Fri, Mar 18, 2011 at 1:07 PM, ian douglas <i...@armorgames.com> wrote:
Hi everyone,

I was on the mailing list back in December/January, asking questions about
rebalancing some nodes, etc. We currently have a ring of 3 systems,
redundancy set to 2, and all is well.

We'd like to snapshot our ring and build a new development/staging node from
it (the old dev node is quite stale), and we're curious what the "best
practice" is for something like that.

We're thinking we might replicate our 3 nodes as 3 more new nodes, but on a
whole new ring, then remove one node, issue flush/cleanup commands on the
remaining two (with redundancy set to '2', we should only need to remove one
node, to have all data on both remaining nodes, right?), then tarball the
Cassandra data path from one machine, and download it to a local development
environment.
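
The "remove one node" reasoning above can be checked with a small simulation. This is only a sketch under stated assumptions: it takes "redundancy" to mean replication factor, assumes the simple placement where each key goes to the first node at or after its token plus the next nodes clockwise, and assumes the removed node's ranges are handed off to the survivors before it leaves. The token values, node names, and helper functions are hypothetical and are not Cassandra APIs.

```python
from bisect import bisect_left


def replicas(key_token, ring, rf):
    """Return the nodes holding a key under simple clockwise placement.

    ring maps token -> node name (hypothetical values, not real tokens).
    """
    tokens = sorted(ring)
    # First node whose token is >= the key token, wrapping around the ring.
    start = bisect_left(tokens, key_token) % len(tokens)
    count = min(rf, len(tokens))
    return [ring[tokens[(start + i) % len(tokens)]] for i in range(count)]


def coverage(ring, rf, keys):
    """Count how many of the given keys each node holds."""
    held = {node: 0 for node in ring.values()}
    for key in keys:
        for node in replicas(key, ring, rf):
            held[node] += 1
    return held


keys = range(300)                           # toy token space 0..299
three_nodes = {0: "A", 100: "B", 200: "C"}  # assumed evenly spaced ring
two_nodes = {0: "A", 100: "B"}              # same ring after removing C

print(coverage(three_nodes, 2, keys))  # each node holds only part of the data
print(coverage(two_nodes, 2, keys))    # each node holds every key
```

Under these assumptions, every key lands on both remaining nodes once the ring has two members and a replication factor of 2, so a tarball from either one would contain the full data set.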

As long as we're using the same version of Cassandra, is there any drawback
to this approach?

Thanks,
Ian



