Clarification on num_tokens setting
As I understand the num_tokens setting, it makes Cassandra do the following pseudocode when a new node is added:

    for 1...num_tokens do
        my_token = rand(0, 2^128-1)
        next_token = min(tokens in cluster where token > my_token)
        my_range = (my_token, next_token - 1)
    done

Now the new node owns num_tokens chunks of keys that previously belonged to other nodes.

My point is, with 1 node in the cluster, the ring is divided into num_tokens ranges. With N nodes, the ring is divided into N*num_tokens. Correct? The docs do not make this clear for me.

And another point: the tokens are randomly chosen, so the ranges of keys are not uniform, although with enough nodes in the cluster there probably won't be any really large ranges. Correct?
Re: Clarification on num_tokens setting
On 6/02/2013, at 5:43 AM, Baron Schwartz ba...@xaprb.com wrote:

> With N nodes, the ring is divided into N*num_tokens. Correct?

There is always num_tokens tokens in the ring. Each node has (num_tokens / N) * RF ranges on it.

> so the ranges of keys are not uniform, although with enough nodes in the cluster there probably won't be any really large ranges. Correct?

Even without vnodes there is no guarantee that nodes had contiguous key ranges.

Cheers
Aaron Morton
Freelance Cassandra Developer
New Zealand
@aaronmorton
http://www.thelastpickle.com
Re: Clarification on num_tokens setting
On Tue, Feb 5, 2013 at 12:42 PM, aaron morton aa...@thelastpickle.com wrote:

> > With N nodes, the ring is divided into N*num_tokens. Correct?
>
> There is always num_tokens tokens in the ring. Each node has (num_tokens / N) * RF ranges on it.

That means every node should have the same num_token parameter? In other words it is cluster wide parameter. Correct?

Thank you,
Andrey
Re: Clarification on num_tokens setting
On 6/02/2013, at 10:36 AM, Andrey Ilinykh ailin...@gmail.com wrote:

> There is always num_tokens tokens in the ring.

I got this wrong. Each node *does* have num_tokens tokens.

> With N nodes, the ring is divided into N*num_tokens. Correct?

Yes

> In other words it is cluster wide parameter. Correct?

Yes.

Cheers
Aaron Morton
Freelance Cassandra Developer
New Zealand
@aaronmorton
http://www.thelastpickle.com
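The corrected counts can be checked with a small simulation. This is only a sketch of the pseudocode from the original question, not Cassandra's actual token allocation code: the 2^128 token space is taken from that pseudocode, every node is assumed to use the same num_tokens, and the names add_node, build_ring and range_sizes are made up for illustration.

```python
import random

RING_SIZE = 2**128  # token space as written in the question's pseudocode


def add_node(ring, node, num_tokens, rng):
    """Give `node` num_tokens distinct random tokens; `ring` maps token -> owner."""
    added = 0
    while added < num_tokens:
        token = rng.randrange(RING_SIZE)
        if token not in ring:  # skip the (very unlikely) collision
            ring[token] = node
            added += 1


def build_ring(n_nodes, num_tokens, seed=0):
    """Build a ring of n_nodes, each with the same num_tokens (an assumption)."""
    rng = random.Random(seed)
    ring = {}
    for n in range(n_nodes):
        add_node(ring, f"node-{n}", num_tokens, rng)
    return ring


def range_sizes(ring):
    """Size of each range, measured from one token to the next, wrapping around."""
    tokens = sorted(ring)
    sizes = []
    for i, token in enumerate(tokens):
        next_token = tokens[(i + 1) % len(tokens)]
        # A ring with a single token has one range covering the whole ring.
        sizes.append((next_token - token) % RING_SIZE or RING_SIZE)
    return sizes


if __name__ == "__main__":
    ring = build_ring(n_nodes=4, num_tokens=256)
    sizes = range_sizes(ring)
    print("ranges in ring:", len(sizes))  # N * num_tokens
    print("largest / smallest range:", max(sizes) / min(sizes))
```

With 4 nodes and num_tokens = 256 the ring holds 4 * 256 = 1024 ranges, and the spread between the largest and smallest range shows that randomly chosen tokens do not give uniform ranges.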
Re: Clarification on num_tokens setting
On Tue, Feb 5, 2013 at 4:19 PM, aaron morton aa...@thelastpickle.com wrote:

> > In other words it is cluster wide parameter. Correct?
>
> Yes.

Actually, num_tokens is a per-node setting. It might make sense, for example, to assign different numbers of tokens in a cluster with heterogeneous hardware, but I would urge caution, as there is currently no post facto way to increase or decrease a node's token count.

--
Eric Evans
Acunu | http://www.acunu.com | @acunu
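To illustrate why a per-node num_tokens could suit heterogeneous hardware, here is a rough simulation. It is a sketch under assumptions, not Cassandra's implementation: tokens are drawn uniformly at random from the 2^128 space used in the original pseudocode, a token owns the range up to the next token (the convention from that pseudocode), and the function name ownership and the node names are hypothetical.

```python
import random
from fractions import Fraction

RING_SIZE = 2**128  # token space as written in the question's pseudocode


def ownership(tokens_per_node, seed=0):
    """Return each node's share of the ring, given a per-node token count."""
    rng = random.Random(seed)
    ring = {}  # token -> owning node
    for node, num_tokens in tokens_per_node.items():
        added = 0
        while added < num_tokens:
            token = rng.randrange(RING_SIZE)
            if token not in ring:  # skip the (very unlikely) collision
                ring[token] = node
                added += 1

    tokens = sorted(ring)
    shares = {node: Fraction(0) for node in tokens_per_node}
    for i, token in enumerate(tokens):
        next_token = tokens[(i + 1) % len(tokens)]
        # Range from this token to the next one, wrapping around the ring.
        size = (next_token - token) % RING_SIZE or RING_SIZE
        shares[ring[token]] += Fraction(size, RING_SIZE)
    return shares


if __name__ == "__main__":
    # Hypothetical cluster: two smaller machines and one larger one.
    shares = ownership({"small-1": 128, "small-2": 128, "big": 512})
    for node, share in shares.items():
        print(f"{node}: {float(share):.1%} of the ring")
```

In this model a node's expected share is its token count divided by the total number of tokens in the cluster, so the node with 512 tokens ends up with roughly two thirds of the ring.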