> 2) When I brought 2 nodes down (out of 3), I was able to start one node
> (with 66 % load below) even though auto_bootstrap is set to true. Shouldn't
> it have failed for the same reason?

This is a good point/question. As far as I can tell, a node being
bootstrapped would need to receive data from a sufficient number of
replicas to satisfy the maximum consistency level that the
application(s) use, in order to avoid the potential for violating the
consistency requirement expected by clients. Without knowing what the
application expects, that would imply a quorum of replicas.

I just checked the code, and my reading (untested) is that the intent
is to receive data from all nodes responsible for the part of the ring
that is being taken over. If so, that satisfies the above requirement.

However, that reading is inconsistent with your test, which suggests
you were able to bootstrap with two nodes missing out of three.
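
To make the mismatch concrete, here is a small Python sketch of the
arithmetic. It is not Cassandra code: the function names are made up
for illustration, and the replica count of 3 is an assumption, since
the replication factor in your test is not stated.

```python
def quorum(replica_count: int) -> int:
    """Smallest majority of replica_count replicas."""
    return replica_count // 2 + 1


def quorum_available(alive: int, replica_count: int) -> bool:
    """True if enough replicas are up to stream from a quorum."""
    return alive >= quorum(replica_count)


def all_available(alive: int, replica_count: int) -> bool:
    """True if every replica for the range is up (my reading of the code's intent)."""
    return alive == replica_count


# Assumed scenario: 3 replicas for the range, 2 of them down, so 1 alive.
replica_count, alive = 3, 1
print(quorum(replica_count))                   # majority needed out of 3
print(quorum_available(alive, replica_count))  # quorum rule
print(all_available(alive, replica_count))     # stricter "all replicas" rule
```

Under those assumptions neither rule would let the bootstrap proceed
with only one of three replicas up, which is why the test result is
surprising.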

Is your nodetool output from the new node or the pre-existing online
node? It only lists two nodes, rather than 3 or 4 (with some being
Down). If the only remaining node doesn't know about the other two
that are down, that may explain it.

I may be misreading the code, because it is suddenly unclear to me how
this is supposed to work when a node is down (supposing it is truly
down, forever, and needs to be replaced).

Anyone?

-- 
/ Peter Schuller
