Thanks for the help, this seems to have worked. Except that while adding the new node we assigned the same token to a different IP (an operational script goof-up) and brought that node up, so the other nodes just logged the message that a new IP had taken over the token.

- So we brought it down, fixed it, and it all came up fine.
- Ran removetoken; it did not finish.
- So ran removetoken force; that seemed to work.
- Cleaned up the nodes.
- Everything from the ring perspective appeared OK on all nodes, except for the error message reported in this thread (which, based on some thread, it seemed would go away) => http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/0-7-4-Replication-assertion-error-after-removetoken-removetoken-force-and-a-restart-td6311082.html
- So I restarted the one node that was complaining (this was not the node that was replaced).
- But once this node was restarted, the ring command on it showed the old single token IP (the one we removed).
- So I am running removetoken again; it has been running for about 2-3 hours now.

The ring shows:

                                                113427455640312821154458202477256070485
10.xxx.0.184  Up    Normal   829.73 GB  33.33%  0
10.xxx.0.185  Up    Normal   576.09 GB  33.33%  56713727820156410577229101238628035241
10.xxx.0.189  Down  Leaving  139.73 KB  0.00%   56713727820156410577229101238628035242
10.xxx.0.188  Up    Normal   697.41 GB  33.33%  113427455640312821154458202477256070485

What are my choices here? How do I clean up the ring? The other 2 nodes show the ring fine (not even aware of 189).

Thanks
Anand

On Fri, Aug 19, 2011 at 11:53 AM, Anand Somani <meatfor...@gmail.com> wrote:

> ok I will go with the IP change strategy and keep you posted. Not going to
> manually copy any data, just bring up the node and let it bootstrap.
>
> Thanks
>
>
> On Fri, Aug 19, 2011 at 11:46 AM, Peter Schuller <
> peter.schul...@infidyne.com> wrote:
>
>> > (Yes, this should definitely be easier. Maybe the most generally
>> > useful fix would be for Cassandra to support a node joining the ring
>> > in "write-only" mode. This would be useful in other cases, such as
>> > when you're trying to temporarily off-load a node by disabling
>> > gossip).
>>
>> I knew I had read discussions before:
>>
>> https://issues.apache.org/jira/browse/CASSANDRA-2568
>>
>> --
>> / Peter Schuller (@scode on twitter)
>>
>
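
A note on the tokens in the ring output above: they line up with evenly spaced tokens for a three-node ring, with 10.xxx.0.185 sitting exactly one below the token still listed for 10.xxx.0.189. The sketch below reproduces those values. It assumes RandomPartitioner's token space of 0 to 2**127, which is not stated anywhere in this thread, and the function name even_tokens is made up for illustration.

```python
# Minimal sketch: evenly spaced initial tokens for an n-node ring.
# Assumption (not stated in the thread): RandomPartitioner, token space 0..2**127.

TOKEN_SPACE = 2 ** 127


def even_tokens(node_count):
    """Return node_count tokens spaced evenly around the token space."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    # Integer arithmetic only; floats cannot represent 127-bit values exactly.
    return [i * TOKEN_SPACE // node_count for i in range(node_count)]


if __name__ == "__main__":
    for token in even_tokens(3):
        print(token)
```

Comparing these values against the ring output makes it easier to see which entry is the leftover one: the Down/Leaving entry for 189 holds the evenly spaced token, and 185 holds that token minus one.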