The definition of "down" is important here. 

Down refers to a node that has joined the ring, so the other nodes know of its 
existence and the range it is storing, but which is not responding to gossip 
messages. While it is down it is still considered an endpoint. The error you 
and Patrik saw refers to the number of endpoints in the ring, not the number of 
Up nodes. When doing dev I have a 2-node cluster on my laptop with rf=2, and 
it's fine to bring the nodes in the cluster up one at a time. 

The issue I think you and Patrik are seeing occurs when you *remove* nodes from 
the ring. The ring does not know if they are up or down. E.g. you have a ring 
of 3 nodes, and add a keyspace with RF 3. Then for whatever reason 2 nodes are 
removed from the ring. When bootstrapping a node into this ring it will fail 
because it detects the cluster does not have enough *endpoints* (different to 
up nodes) to support the keyspace. 
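
As a rough illustration of the endpoints vs. up nodes distinction, here is a 
minimal Python sketch. It is not Cassandra's code: the Endpoint class, the 
can_bootstrap function and the count_self flag are hypothetical names made up 
for this example, and whether the joining node counts itself is left as a 
parameter because that is not established here. 

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    name: str
    up: bool  # gossip liveness only; a down node is still a ring member


def can_bootstrap(ring, replication_factor, count_self=False):
    """Illustrative check: compare the endpoint count with the RF.

    `ring` lists the endpoints the ring knows about, up or down.
    `count_self` is an assumption knob for whether the joining node
    includes itself in the count.
    """
    endpoints = len(ring) + (1 if count_self else 0)
    return endpoints >= replication_factor


# Three endpoints, two of them down: still three endpoints, so RF 3 is met.
down_ring = [Endpoint("n1", True), Endpoint("n2", False), Endpoint("n3", False)]

# Two nodes removed from the ring: only one endpoint is left.
removed_ring = [Endpoint("n1", True)]

print(can_bootstrap(down_ring, 3))
print(can_bootstrap(removed_ring, 3))
print(can_bootstrap(removed_ring, 3, count_self=True))
```

In this sketch the up flag is never consulted by the check, which is the 
point: only membership in the ring counts towards the endpoint total. 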

One thing I want to double check is that the node doing the bootstrap considers 
itself when calculating the number of endpoints. Some of the things you and 
Patrik said about bootstrapping node 3 into a ring of 3 with rf=3 made me want 
to check. 

IMHO bootstrapping is the process of pulling data the *new* node is responsible 
for from other nodes in the ring. This is different to joining the ring. 

Hope that helps.
Aaron


On 9/03/2011, at 10:54 AM, mcasandra wrote:

> I think this is not the right functionality and it is really odd that you
> can't successfully bring it online without turning off bootstrap BUT you can
> bring it online by turning auto_bootstrap off and then running nodetool
> repair afterwards.
> 
> Also, if that's the case then when one node goes down, say out of 3 one node
> goes down then should cassandra eject other nodes as well?? Why should
> cassandra exit on startup? That node could at least serve other keyspaces
> and alleviate load while returning errors to the client for those keyspaces
> where RF cannot be met. 
> 
> As noted in my other post regarding a similar issue that I reported, I have
> also seen weird behaviour where I had 2 nodes down out of 3 and I was able
> to bring up one of the nodes but not the remaining one. You would think that
> no nodes will come up but I really think there is a problem here.