Re: How does a node decide where each of its vnodes will be replicated to?

2021-11-08 Thread manish khandelwal
Just to add on to your response: *num_tokens* defines the number of vnodes a node can have. Default is 256. The *initial token* range is predefined (for Murmur3, -2**63 to 2**63-1). So if you have a one-node cluster (which does not make sense) with num_tokens set to 256, then you will have 256 vnodes. Scaling up

Re: How does a node decide where each of its vnodes will be replicated to?

2021-11-08 Thread Tech Id
I will try to give a made-up example to show what I understand. Let us assume our hash function outputs a number between 1 and 10,000, so hash(primary-key) is between 1 and 10,000. Prior to vnodes, the above 1 to 10k range was split among the nodes. With vnodes, this 10k range is now split into say

Re: How does a node decide where each of its vnodes will be replicated to?

2021-11-08 Thread Jeff Jirsa
I think your mental model here is trying to map a different db concept (like elasticsearch shards) to a distributed hash table that doesn't really map that way. There's no such physical thing as a vnode. Vnode, as a concept, is "a single node runs multiple tokens and owns multiple ranges". Multiple

Re: How does a node decide where each of its vnodes will be replicated to?

2021-11-08 Thread Tech Id
Thanks Jeff. One follow-up question please: Each node specifies num_tokens. So if there are 4 nodes and each specifies 256 tokens, then it means together they are responsible for 1024 vnodes. Now, when a fifth node joins and has num_tokens set to 256 as well, then does the system have 1024+256 =

Re: How does a node decide where each of its vnodes will be replicated to?

2021-11-08 Thread Jeff Jirsa
When a machine starts for the first time, the joining node basically chooses a number of tokens (num_tokens) randomly within the range of the partitioner (for murmur3, -2**63 to 2**63-1), and then bootstraps to claim them. This is sort of a lie: in newer versions, we try to make it a bit more
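
A minimal sketch of the random token selection described above, not Cassandra's actual code. The helper name `pick_random_tokens`, the node names, and the fixed seed are all made up for illustration, and this ignores the smarter allocation that newer versions attempt.

```python
import random

# Murmur3 partitioner token range, as given in the thread.
MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1


def pick_random_tokens(num_tokens, taken=(), rng=None):
    """Hypothetical helper: pick num_tokens distinct random tokens
    that are not already claimed by another node."""
    rng = rng or random.Random()
    taken = set(taken)
    chosen = set()
    while len(chosen) < num_tokens:
        candidate = rng.randint(MIN_TOKEN, MAX_TOKEN)
        if candidate not in taken:
            chosen.add(candidate)
    return sorted(chosen)


# Toy ring: maps token -> owning node. Five nodes join one after another,
# each with num_tokens = 256, so the ring ends up holding 5 * 256 tokens.
ring = {}
rng = random.Random(42)  # fixed seed only to make the sketch repeatable
for node in ["n1", "n2", "n3", "n4", "n5"]:
    for token in pick_random_tokens(256, taken=ring, rng=rng):
        ring[token] = node

print(len(ring))
```

Each node ends up with 256 tokens scattered around the ring, which is why the ranges it owns are discontiguous.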

Re: How does a node decide where each of its vnodes will be replicated to?

2021-11-08 Thread Tech Id
Thanks Jeff. I think what you explained below is before and after the introduction of vnodes. The vnodes part is clear - how each node holds a small range of tokens and how each node holds a discontiguous set of vnodes. 1. What is not clear is how each node decides what vnodes it will get. If it

Re: How does a node decide where each of its vnodes will be replicated to?

2021-11-08 Thread Jeff Jirsa
Vnodes are implemented by giving a single process multiple tokens. Tokens ultimately determine which data lives on which node. When you hash a partition key, it gives you a token (let's say 570). The 3 processes that own token 570 are the next 3 tokens in the ring ABOVE 570, so if you had A = 0 B
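
A rough sketch of the ring walk described above, assuming a simple strategy that ignores racks and datacenters and skips tokens whose owner has already been picked. The ring contents and the function name `replicas_for` are made up for illustration and are not taken from the thread or from Cassandra's code.

```python
from bisect import bisect_left

# Hypothetical toy ring: token -> owning process. A and B each hold two
# tokens, which is what "vnodes" means in practice.
toy_ring = {100: "A", 300: "B", 500: "C", 600: "A", 800: "D", 900: "B"}


def replicas_for(token, ring, rf=3):
    """Walk the ring upward from `token` (wrapping around at the end) and
    collect the first `rf` distinct owners."""
    tokens = sorted(ring)
    # First ring token at or above the key's token; an index equal to
    # len(tokens) wraps to the start via the modulo below.
    start = bisect_left(tokens, token)
    owners = []
    for i in range(len(tokens)):
        owner = ring[tokens[(start + i) % len(tokens)]]
        if owner not in owners:  # assumption: one replica per process
            owners.append(owner)
        if len(owners) == rf:
            break
    return owners


print(replicas_for(570, toy_ring))  # ['A', 'D', 'B']
print(replicas_for(850, toy_ring))  # ['B', 'A', 'C'] (wraps, skips B's second token)
```

With this toy ring, token 570 lands on the tokens 600, 800 and 900, so the replicas are A, D and B. Nothing is "assigned" per vnode: the replica set falls out of the token order alone.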

How does a node decide where each of its vnodes will be replicated to?

2021-11-08 Thread Tech Id
Hello, Going through https://docs.datastax.com/en/cassandra-oss/3.0/cassandra/architecture/archDataDistributeDistribute.html . But it is not clear how a node decides where each of its vnodes will be replicated to. As an example from the above page: 1. Why is vnode A present in nodes 1,2 and