Thanks Ben. That's what I was afraid I had to do. I can see how it's a lot easier if you simply double the cluster when adding capacity.
Jon

On Jun 9, 2011, at 4:44 PM, Benjamin Coverston wrote:

> Because you were able to successfully run repair, you can follow up with a
> nodetool cleanup, which will get rid of some of the extraneous data on that
> (bigger) node. You're also assured after you run repair that entropy between
> the nodes is minimal.
>
> Assuming you're using the random partitioner: to balance your ring I would
> start by calculating the new token locations, then moving each of your nodes
> backwards along its owned range to its new location.
>
> From the script on http://wiki.apache.org/cassandra/Operations your new
> balanced tokens would be:
>
> 0
> 21267647932558653966460912964485513216
> 42535295865117307932921825928971026432
> 63802943797675961899382738893456539648
> 85070591730234615865843651857942052864
> 106338239662793269832304564822427566080
> 127605887595351923798765477786913079296
> 148873535527910577765226390751398592512
>
> From this you can see that 10.46.108.{100, 101} are already in the right
> place, so you don't have to do anything with those nodes. Proceed with
> moving 10.46.108.104 to its new token; the safest way to do this would be to
> use nodetool move. Another way would be to run a removetoken followed by
> re-adding the node to the ring at its new location. The risk here is that if
> you do not at least run repair after re-joining the ring (and before you
> move the next node in the ring), some of the data on that node would be
> ignored, as it would now fall outside the owned range. So it's good practice
> to immediately run repair on a node that you do a removetoken / re-join on.
>
> The rest of your balancing should be an iteration of the above steps, moving
> through the range.
>
> On 6/9/11 6:21 AM, Jonathan Colby wrote:
>> I got myself into a situation where one node (10.47.108.100) has a lot more
>> data than the other nodes. In fact, the 1 TB disk on this node is almost
>> full.
>> I added 3 new nodes and let Cassandra automatically calculate new tokens by
>> splitting the ranges of the highest-loaded nodes. Unfortunately there is
>> still a big token range this node is responsible for (5113... - 85070...).
>> Yes, I know that one option would be to rebalance the entire cluster with
>> move, but this is an extremely time-consuming and error-prone process
>> because of the amount of data involved.
>>
>> Our RF = 3 and we read/write at quorum. The nodes have been repaired, so I
>> think the data should be in good shape.
>>
>> Question: Can I get myself out of this mess without installing new nodes?
>> I was thinking of either decommission or removetoken to have the cluster
>> "rebalance itself", then re-bootstrap this node to a new token.
>>
>> Address        Status  State   Load       Owns    Token
>>                                                   127605887595351923798765477786913079296
>> 10.46.108.100  Up      Normal  218.52 GB  25.00%  0
>> 10.46.108.101  Up      Normal  260.04 GB  12.50%  21267647932558653966460912964485513216
>> 10.46.108.104  Up      Normal  286.79 GB  17.56%  51138582157040063602728874106478613120
>> 10.47.108.100  Up      Normal  874.91 GB  19.94%  85070591730234615865843651857942052863
>> 10.47.108.102  Up      Normal  302.79 GB  4.16%   92156241323118845370666296304459139297
>> 10.47.108.103  Up      Normal  242.02 GB  4.16%   99241191538897700272878550821956884116
>> 10.47.108.101  Up      Normal  439.9 GB   8.34%   113427455640312821154458202477256070484
>> 10.46.108.103  Up      Normal  304 GB     8.33%   127605887595351923798765477786913079296
>
> --
> Ben Coverston
> Director of Operations
> DataStax -- The Apache Cassandra Company
> http://www.datastax.com/
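[Editor's note: the balanced tokens Ben lists come from evenly dividing the random partitioner's 2^127 token space across the eight nodes, i.e. token_i = i * 2^127 / N, which is what the script on the wiki page computes. A minimal sketch of that calculation:]

```python
# Evenly spaced initial tokens for Cassandra's RandomPartitioner:
# token_i = i * 2**127 // N, for an N-node cluster.
NUM_NODES = 8
RING_SIZE = 2 ** 127  # size of the RandomPartitioner token space

tokens = [i * RING_SIZE // NUM_NODES for i in range(NUM_NODES)]
for t in tokens:
    print(t)
```

Running this reproduces the eight tokens in Ben's list, starting at 0 and ending at 148873535527910577765226390751398592512.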
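[Editor's note: for anyone checking the Owns column in the ring output above: with the random partitioner, each node owns the arc of the 2^127 ring from its predecessor's token (exclusive) up to its own token, wrapping around at zero. A quick sketch that recomputes those percentages from the tokens as transcribed above:]

```python
RING_SIZE = 2 ** 127  # size of the RandomPartitioner token space

# (address, token) pairs transcribed from the nodetool ring output above
ring = [
    ("10.46.108.100", 0),
    ("10.46.108.101", 21267647932558653966460912964485513216),
    ("10.46.108.104", 51138582157040063602728874106478613120),
    ("10.47.108.100", 85070591730234615865843651857942052863),
    ("10.47.108.102", 92156241323118845370666296304459139297),
    ("10.47.108.103", 99241191538897700272878550821956884116),
    ("10.47.108.101", 113427455640312821154458202477256070484),
    ("10.46.108.103", 127605887595351923798765477786913079296),
]

owns = {}
for i, (addr, token) in enumerate(ring):
    prev = ring[i - 1][1]  # wraps: the node at token 0 owns the arc after the last token
    owns[addr] = (token - prev) % RING_SIZE / RING_SIZE
    print(f"{addr}  {owns[addr]:.2%}")
```

The output matches the Owns column, and makes the imbalance obvious: the node at token 0 owns a quarter of the ring while three nodes own roughly 4-8% each.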