Re: Consistent hashing vnodes and ring in cassandra

2019-08-17 Thread Jeff Jirsa
The hash results in a token

The replicas are typically the RF instances with tokens numerically larger than 
the hash

So if a row hashes to token 1, RF is 3, and the instances have tokens
-100,0,100,200,300,400,500, the replica instances are 100,200,300, and
Cassandra considers them to be identical for nearly all purposes
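
The lookup described above can be sketched in a few lines of Python. This is
only an illustration of the idea, not Cassandra's code: the function name
replicas_for and the use of a sorted list are assumptions made for the
example, and it ignores racks and vnodes entirely.

```python
from bisect import bisect_left

def replicas_for(token, ring_tokens, rf):
    """Return the rf instance tokens that replicate `token` (hypothetical helper)."""
    ring = sorted(ring_tokens)
    # Find the first instance token >= the row's token; wrap to the start
    # of the ring if the row's token is larger than every instance token.
    start = bisect_left(ring, token) % len(ring)
    # Take that instance and the ones that follow it around the ring.
    return [ring[(start + i) % len(ring)] for i in range(min(rf, len(ring)))]

ring = [-100, 0, 100, 200, 300, 400, 500]
print(replicas_for(1, ring, rf=3))    # [100, 200, 300]
print(replicas_for(450, ring, rf=3))  # [500, -100, 0] (wraps around the ring)
```

Because the result depends only on the token and the ring, any coordinator
can compute the same replica list for both reads and writes.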

Some snitches will skip some of the replicas to satisfy other requirements: if
200 and 300 are in the same rack (or on the same machine because of vnodes), a
rack-aware snitch will choose 100,200,400 instead.
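
A rough sketch of that rack-aware skipping, again in Python. The rack
assignments in rack_of are invented for the example (the message only says
that 200 and 300 share a rack), the function name is hypothetical, and the
fallback for clusters with fewer racks than RF is a simplification. The real
placement logic in Cassandra is more involved.

```python
from bisect import bisect_left

def rack_aware_replicas(token, ring_tokens, rack_of, rf):
    """Walk the ring from `token`, preferring one replica per rack (sketch)."""
    ring = sorted(ring_tokens)
    start = bisect_left(ring, token) % len(ring)
    chosen, used_racks, skipped = [], set(), []
    for i in range(len(ring)):
        if len(chosen) == rf:
            break
        candidate = ring[(start + i) % len(ring)]
        if rack_of[candidate] in used_racks:
            skipped.append(candidate)  # rack already holds a replica
            continue
        chosen.append(candidate)
        used_racks.add(rack_of[candidate])
    # Assumed fallback: with fewer racks than rf, reuse skipped instances.
    chosen.extend(skipped[: rf - len(chosen)])
    return chosen

ring = [-100, 0, 100, 200, 300, 400, 500]
# Assumed rack layout: 200 and 300 share rack "r2".
rack_of = {-100: "r1", 0: "r2", 100: "r1", 200: "r2",
           300: "r2", 400: "r3", 500: "r1"}
print(rack_aware_replicas(1, ring, rack_of, rf=3))  # [100, 200, 400]
```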




Re: Consistent hashing vnodes and ring in cassandra

2019-08-17 Thread Inquistive allen
Hello Jeff,

Thanks for the response.
I think I have it right now. One last thing: when a read request comes in to
the coordinator node, the partition key is hashed and the node where the
corresponding data was previously stored is located. How does the coordinator
node locate the replica nodes for this row?
The first copy was written based on the hash, but the replica copies were
written based on the replication strategy.
Will the hash of any partition key identify all nodes where the data is present?

Thanks




Re: Consistent hashing vnodes and ring in cassandra

2019-08-17 Thread Jeff Jirsa



> On Aug 17, 2019, at 10:53 AM, Inquistive allen  wrote:
> 
> I am a newbie in cassandra. I have asked this question on various platforms
> but never got a satisfying answer.
> Hence I thought of bringing up the topic here. Sorry if this is a simple
> question.
> 
> 1. I studied the paper on consistent hashing (which is being implemented in
> Cassandra)
> 2. Cassandra has the concept of Vnodes. The vnodes (as I understand, a Vnode
> is a collection of hashes) are the basic blocks of replication in
> Cassandra. It is the vnodes which are replicated across the cluster. Please
> do correct me if I'm wrong.

Vnodes JUST mean each host has more than one token

> 3. Suppose I have a Keyspace A with replication factor 3 and Keyspace B with
> replication factor 2.
> 4. Is it that a Vnode is a collection of hashes of data from various
> Keyspaces?
> 5. In that case, with Keyspaces having varying replication factors,
> replicating them to other nodes would be a problem.
> 6. Now from the consistent hashing paper, I get a feeling that each
> Keyspace has a different ring. Also the name "KEYSPACE" points to a ring of
> keys.
> So is it that each keyspace has a different ring? If it is so, everything
> else like replicating vnodes among nodes in the cluster would fall in place.
> Each Keyspace has a different ring ---> each Vnode has data of various
> tables from a given keyspace ---> hence only copies equal to RF are made in
> the cluster.
> 
> I know I am missing something. This way of understanding things might be wrong.
> Kindly help me understand the same, as this would help me visualise repair,
> bootstrap, adding cluster, streaming operations in a much better way.
> 

The easiest way to visualize most cassandra operations is to draw the tokens in 
a circle. Vnodes means extra tokens
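
To make "vnodes means extra tokens" concrete, here is a small Python sketch.
The host names and token assignments are made up for the example, and the
placement rule (walk the ring, skip hosts already chosen) is a simplification
rather than Cassandra's implementation. It also shows one ring serving two
replication factors, since RF is a per-keyspace setting applied to the same
tokens.

```python
from bisect import bisect_left

# Hypothetical cluster: three hosts, each owning several tokens (vnodes).
host_tokens = {
    "host-a": [-100, 200, 500],
    "host-b": [0, 300],
    "host-c": [100, 400],
}

# One ring for the whole cluster: token -> owning host.
owner = {t: host for host, tokens in host_tokens.items() for t in tokens}
ring = sorted(owner)

def replica_hosts(token, rf):
    """Walk the ring from `token`, collecting rf distinct hosts (sketch)."""
    start = bisect_left(ring, token) % len(ring)
    hosts = []
    for i in range(len(ring)):
        host = owner[ring[(start + i) % len(ring)]]
        if host not in hosts:  # adjacent tokens may belong to the same host
            hosts.append(host)
        if len(hosts) == rf:
            break
    return hosts

# Same ring, different replication factors (e.g. keyspace A and keyspace B).
print(replica_hosts(1, rf=3))    # ['host-c', 'host-a', 'host-b']
print(replica_hosts(1, rf=2))    # ['host-c', 'host-a']
print(replica_hosts(450, rf=2))  # ['host-a', 'host-b']
```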

Replica sets are adjacent tokens. You stream from any node in the replica set in
the common replacement case, or the losing replica in the expansion case






Consistent hashing vnodes and ring in cassandra

2019-08-17 Thread Inquistive allen
I am a newbie in cassandra. I have asked this question on various platforms but 
never got a satisfying answer.
Hence I thought of bringing up the topic here. Sorry if this is a simple
question.

1. I studied the paper on consistent hashing (which is being implemented in 
Cassandra)
2. Cassandra has the concept of Vnodes. The vnodes (as I understand, a Vnode is
a collection of hashes) are the basic blocks of replication in Cassandra. It
is the vnodes which are replicated across the cluster. Please do correct me if
I'm wrong.
3. Suppose I have a Keyspace A with replication factor 3 and Keyspace B with 
replication factor 2. 
4. Is it that a Vnode is a collection of hashes of data from various Keyspaces?
5. In that case, with Keyspaces having varying replication factors, replicating
them to other nodes would be a problem.
6. Now from the consistent hashing paper, I get a feeling that each Keyspace
has a different ring. Also the name "KEYSPACE" points to a ring of keys.
So is it that each keyspace has a different ring? If it is so, everything
else like replicating vnodes among nodes in the cluster would fall in place.
Each Keyspace has a different ring ---> each Vnode has data of various
tables from a given keyspace ---> hence only copies equal to RF are made in
the cluster.

I know I am missing something. This way of understanding things might be wrong.
Kindly help me understand the same. As this would help me visualise repair, 
bootstrap, adding cluster, streaming operations in a much better way.

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org