The documentation for that section needs to be updated. What happens is that if you just autobootstrap without setting a token, the new node will by default bisect the token range of the largest node.
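Here is a rough Python sketch of that bisection behavior, in case it helps to see it play out. It is a toy model, not Cassandra's actual bootstrap code: the function names `add_node` and `ownership` are made up for illustration, "largest" is taken to mean the widest token range, ties go to the earliest node, and the new node is assumed to take the upper half of the split range.

```python
from fractions import Fraction


def add_node(ring, name):
    """Bisect the widest range; the new node takes the upper half.

    ring is a list of (node_name, start, end) tuples. Ties go to the
    earliest entry, which is an assumption of this sketch.
    """
    idx = max(range(len(ring)), key=lambda i: ring[i][2] - ring[i][1])
    owner, start, end = ring[idx]
    mid = (start + end) / 2
    ring[idx] = (owner, start, mid)         # existing node keeps the lower half
    ring.insert(idx + 1, (name, mid, end))  # new node takes the upper half
    return ring


def ownership(ring):
    """Return each node's share of the whole ring as a Fraction."""
    total = sum(end - start for _, start, end in ring)
    return {name: (end - start) / total for name, start, end in ring}


# Start with a single node owning a toy ring of width 10, then add two more.
ring = [("A", Fraction(0), Fraction(10))]
for new_node in ("B", "C"):
    add_node(ring, new_node)

for name, share in ownership(ring).items():
    print(f"Node {name}: {float(share):.0%}")
```

With three nodes, this model leaves one node holding half of the ring and the other two holding a quarter each, rather than a third apiece.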
So if you go through several iterations of adding nodes, this is what you would see:

Gen 1:
Node A: 100% of tokens, token range 1-10 (for example)

Gen 2:
Node A: 50% of tokens (1-5)
Node B: 50% of tokens (6-10)

Gen 3:
Node A: 25% of tokens (1-2.5)
Node B: 50% of tokens (6-10)
Node C: 25% of tokens (2.6-5)

In reality, what you'd want in gen 3 is for every node to own 33%, but that will not be the case unless you set the tokens to begin with.

You'll notice that there are a couple of scripts available to generate a list of initial tokens for your particular cluster size. Then, every time you add a node, you'll need to update all the nodes with new tokens in order to properly load balance.

Does this make sense?

Other folks, am I explaining this correctly?

David

2012/1/13 Carlos Pérez Miguel <cperez...@gmail.com>

> Hello,
>
> I have a doubt about how initial token is determined. In Cassandra's
> documentation it is said that it is better to manually configure the
> initial token to each node in the system but also is said that if
> initial token is not defined and autobootstrap is true, new nodes
> choose initial token in order to better the load balance of the
> cluster. But what happens if no initial token is chosen and
> autobootstrap is not activated? How each node selects its initial
> token to balance the ring?
>
> I ask this because I am making tests with a 20 nodes cassandra cluster
> with cassandra 0.7.9. Any node has initial token, nor
> autobootstraping. I restart the cluster with each test I want to make
> and in the end the cluster is always well balanced.
>
> Thanks
>
> Carlos Pérez Miguel
>
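For reference, the token-generation scripts mentioned in the reply boil down to spacing the tokens evenly around the ring. Below is a minimal sketch of that calculation. It assumes the RandomPartitioner, whose token space is taken here to run from 0 to 2**127; `RING_SIZE` and `balanced_tokens` are illustrative names, not part of any Cassandra tool.

```python
RING_SIZE = 2 ** 127  # assumed RandomPartitioner token space


def balanced_tokens(node_count):
    """Return evenly spaced initial tokens, one per node."""
    return [i * RING_SIZE // node_count for i in range(node_count)]


# Tokens for a three-node cluster, one initial_token value per node.
for i, token in enumerate(balanced_tokens(3)):
    print(f"node {i}: initial_token = {token}")
```

Comparing `balanced_tokens(3)` with `balanced_tokens(4)` shows why the existing nodes need new tokens when the cluster grows: apart from token 0, none of the positions carry over.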