Re: Removed node, jumps back into the cluster

2012-09-12 Thread Fredrik Stigbäck
Wrong assumption on my part. I found the answer in
GossipDigestSynVerbHandler: I forgot to change the cluster name of the
new cluster.
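
For anyone reading this later, here is a minimal sketch of the kind of check involved. As far as I understand it, GossipDigestSynVerbHandler drops a Gossip SYN whose cluster name differs from the local one. The class, field and method names below are hypothetical and only illustrate the idea; this is not the actual Cassandra source.

```java
import java.util.Objects;

/** Toy model of a cluster-name guard; NOT the actual Cassandra source. */
class ClusterNameGuard {

    /** Hypothetical stand-in for an incoming Gossip SYN message. */
    static final class Syn {
        final String clusterName;

        Syn(String clusterName) {
            this.clusterName = clusterName;
        }
    }

    private final String localClusterName;

    ClusterNameGuard(String localClusterName) {
        this.localClusterName = Objects.requireNonNull(localClusterName);
    }

    /** Returns true if the SYN should be processed, false if it is dropped. */
    boolean accept(Syn syn) {
        return localClusterName.equals(syn.clusterName);
    }

    public static void main(String[] args) {
        // Node C was given a new cluster name before being started again.
        ClusterNameGuard nodeC = new ClusterNameGuard("New Cluster");
        System.out.println(nodeC.accept(new Syn("Old Cluster"))); // false
        System.out.println(nodeC.accept(new Syn("New Cluster"))); // true
    }
}
```

With the same cluster name on both sides the SYN is accepted, which matches what I saw when node C was pulled back into the old cluster.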

/Fredrik

-- 
Fredrik Larsson Stigbäck
SiteVision AB Vasagatan 10, 107 10 Örebro
019-17 30 30


Removed node, jumps back into the cluster

2012-09-11 Thread Fredrik
I've tested a scenario where I wanted to reuse a removed node in a new
cluster with the same IP. It's maybe not a very common case, but I found
some strange behaviour in Gossiper.


Here is what I think/see happening:
- Cassandra 1.1, three-node cluster: A, B and C.
- Shut down node C and remove the token for node C.
- Everything looks OK in the logs, which report that node C is removed etc.
- Nodes A and B still send Gossip digests about the removed node, but I
  guess that's OK since they know about it (Gossiper.endpointStateMap).
- Node C has status removed when checked in the JMX console.
- Checked in LocationInfo that the ring only contains token/IP for nodes
  A and B.
- Removed the system/data tables for C.
- Changed the seed on C to point to itself.
- Start up node C. Node C only gossips with itself, and nodes A and B
  don't recognize that node C is running, which is correct.
- Restart e.g. node A. Now node A will lose all gossip information
  (Gossiper.endpointStateMap) about node C. Node A will request
  information from LocationInfo and ask node B about endpoint states.
  Node A will receive information from node B about node C. This will
  trigger Gossiper.handleMajorStateChange, and node C will first be
  marked as unreachable because it's in a dead state (removed). Node A
  will then try to Gossip (unreachable endpoints) to node C, which will
  reply that it's up, and node C becomes incorporated into the old
  cluster again.
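
To make the sequence above easier to follow, here is a heavily simplified toy model of it. It is only a sketch under my own assumptions: the Node class, its methods and the cluster names are hypothetical, and real gossip state (generations, versions, heartbeats) is left out entirely. It is not how Gossiper is actually implemented.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the rejoin scenario; NOT Cassandra's actual Gossiper logic. */
class GossipRejoinSketch {
    enum Status { NORMAL, REMOVED }

    /** Hypothetical node: an address, a cluster name and a local endpoint state map. */
    static final class Node {
        final String address;
        final String clusterName;
        final Map<String, Status> endpointStates = new HashMap<>();

        Node(String address, String clusterName) {
            this.address = address;
            this.clusterName = clusterName;
            endpointStates.put(address, Status.NORMAL);
        }

        /** Copy endpoint states we do not know yet from a peer (A asking B after a restart). */
        void learnFrom(Node peer) {
            peer.endpointStates.forEach(endpointStates::putIfAbsent);
        }

        /** Gossip to a peer; in this model it only answers if the cluster names match. */
        boolean gossipTo(Node peer) {
            if (!clusterName.equals(peer.clusterName)) {
                return false; // peer ignores the message, our view is unchanged
            }
            endpointStates.put(peer.address, Status.NORMAL); // peer replied: mark it alive
            return true;
        }
    }

    /** Runs the scenario and returns how the restarted node A ends up seeing node C. */
    static Status statusOfCSeenByA(String clusterNameOfC) {
        Node b = new Node("B", "Old Cluster");
        b.endpointStates.put("C", Status.REMOVED); // B still remembers C as removed
        Node a = new Node("A", "Old Cluster");     // A restarted: only knows itself
        a.learnFrom(b);                            // A now lists C as removed
        Node c = new Node("C", clusterNameOfC);    // C wiped and started on its own
        a.gossipTo(c);                             // A gossips to the "dead" endpoint
        return a.endpointStates.get("C");
    }

    public static void main(String[] args) {
        System.out.println(statusOfCSeenByA("Old Cluster")); // NORMAL
        System.out.println(statusOfCSeenByA("New Cluster")); // REMOVED
    }
}
```

In this model node C only comes back as NORMAL in A's view when it answers under the old cluster name; under a different name A keeps seeing it as removed.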


Is this a bug, or is it a requirement that if you take a node out of
the cluster you must change the IP of the removed node if you want to
use it in another cluster?

Please enlighten me.

Regards
/Fredrik