[
https://issues.apache.org/jira/browse/CASSANDRA-150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12781937#action_12781937
]
Jaakko Laine commented on CASSANDRA-150:
----------------------------------------
A network partition may happen if (1) the cluster has at least four nodes, (2)
all nodes are seeds and (3) at least two nodes boot "simultaneously".
The gossip cycle works as follows:
(i) gossip to a random live node
(ii) gossip to a random unreachable node
(iii) if the node gossiped to at (i) was not a seed, gossip to a random seed
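As a minimal sketch of one such round (Python rather than the actual Java Gossiper; the `send` callback and the bare name sets are stand-ins for the real endpoint-state digest exchange, and I am assuming, per step (1) below, that a node with no live peers also falls through to the seed gossip):

```python
import random

def gossip_round(live, unreachable, seeds, send, rng=random):
    """One gossip round following the three rules above (a sketch)."""
    target = None
    # (i) gossip to a random live node
    if live:
        target = rng.choice(sorted(live))
        send(target)
    # (ii) gossip to a random unreachable node
    if unreachable:
        send(rng.choice(sorted(unreachable)))
    # (iii) if the node gossiped to at (i) was not a seed -- or there was
    # no live node at all -- gossip to a random seed
    if target is None or target not in seeds:
        send(rng.choice(sorted(seeds)))
```

Note that when the live target chosen at (i) is itself a seed, step (iii) is skipped entirely; that is the hinge of the partition scenario below.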
Suppose there are four nodes in the cluster: nodeA, nodeB, nodeC and nodeD, all
of them seeds, and suppose they are all brought online at the same time. The
following event sequence leads to a partition:
(1) nodeA comes online. It sees no live nodes (and no unreachable ones either,
of course), so it gossips to a random seed. Suppose nodeA chooses nodeB and
sends it gossip.
(2) nodeB gets nodeA's gossip and marks it live. It sends its own gossip, and
since it has a live node (nodeA), it gossips according to rule (i). nodeA is a
seed, so no gossip is sent to a random seed at (iii).
(3) nodeC comes online. It has not seen any live nodes yet, so it gossips to a
random seed. Suppose it chooses nodeD.
(4) nodeD comes online and sees nodeC's gossip. Since it now has a live node,
it sends nodeC gossip according to rule (i). nodeC is a seed, so again no
gossip is sent to a random seed.
(There are other sequences as well, but the basic idea is the same.)
Now every node knows of exactly one live node, so each will always gossip
according to rule (i). Since that live node is a seed, rule (iii) never sends
gossip to a random seed, which prevents the nodes from ever finding the rest
of the cluster. A single non-seed node breaks this loop, as gossip sent to it
triggers a gossip to a random seed.
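The four-node trace above can be reproduced in a small simulation (a sketch, not Cassandra code; the `Node` class and its set-of-names state model are my simplification of the real endpoint-state digests):

```python
import random

class Node:
    def __init__(self, name, seeds):
        self.name = name
        self.seeds = seeds    # every node is a seed in this scenario
        self.live = set()     # peers this node believes are live

    def receive(self, sender):
        # receiving gossip marks the sender live and merges its view
        self.live.add(sender.name)
        self.live |= sender.live - {self.name}

    def gossip_round(self, cluster, rng):
        target = None
        if self.live:                       # (i) random live node
            target = rng.choice(sorted(self.live))
            cluster[target].receive(self)
        # (ii) no unreachable nodes arise in this scenario
        if target is None or target not in self.seeds:
            # (iii) live target was not a seed (or there was none)
            seed = rng.choice(sorted(self.seeds - {self.name}))
            cluster[seed].receive(self)

names = ["nodeA", "nodeB", "nodeC", "nodeD"]
cluster = {n: Node(n, set(names)) for n in names}
rng = random.Random(0)

# Bootstrap exactly as in steps (1)-(4): nodeA happens to pick nodeB,
# nodeC happens to pick nodeD.
cluster["nodeB"].receive(cluster["nodeA"])  # (1): nodeA gossips to seed nodeB
cluster["nodeA"].receive(cluster["nodeB"])  # (2): nodeB gossips back to nodeA
cluster["nodeD"].receive(cluster["nodeC"])  # (3): nodeC gossips to seed nodeD
cluster["nodeC"].receive(cluster["nodeD"])  # (4): nodeD gossips back to nodeC

# From here on, every node's only live peer is a seed, so rule (iii)
# never fires and the {nodeA, nodeB} / {nodeC, nodeD} split persists.
for _ in range(1000):
    for n in names:
        cluster[n].gossip_round(cluster, rng)

for n in names:
    print(n, sorted(cluster[n].live))
# nodeA ['nodeB']
# nodeB ['nodeA']
# nodeC ['nodeD']
# nodeD ['nodeC']
```

Changing any one node so that it is not in `seeds` makes rule (iii) fire on the round that gossips to it, which is exactly the non-seed escape hatch described above.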
While investigating this, I noticed we might have harmed the scalability of
the gossip mechanism when we added two new application states for node
movement. I'll fix this bug tomorrow and check whether that is a problem.
> multiple seeds (only when seed count = node count?) can cause cluster partition
> -------------------------------------------------------------------------------
>
> Key: CASSANDRA-150
> URL: https://issues.apache.org/jira/browse/CASSANDRA-150
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: Jonathan Ellis
> Priority: Minor
>
> happens fairly frequently on my test cluster of 5 nodes. (i normally restart
> all nodes at once when updating the code. haven't tested w/ restarting one
> machine at a time.)