Data ownership is defined by the token ring concept.

Hosts in the cluster may have tokens - let's oversimplify to 5 hosts, each
with 1 token A=0, B=1000, C=2000, D=3000, E=4000

The partition key is hashed to calculate the token, and the next 3 hosts in
the ring are the "owners" of that data - a key that hashes to 1234 would be
found on hosts C, D, E

Anytime hosts move tokens (joining/expansion, leaving/shrink,
re-arranging/moves), the tokens go into a pending state.

So if you were to add a 6th host here, let's say F=2500, when it first
gossips, it'll have 2500 in a different JOINING (pending) state. In that
state, it won't get any read traffic, but the quorum calculations will be
augmented to send extra writes - instead of needing 2/3 nodes to ack any
write, it'll require a total of 3 acks (of the 4 possible replicas, the 3
natural replicas and the 1 pending replica).

When the node finishes joining, and it gossips its state=NORMAL, it'll be
removed from pending, and the reads will move to it instead.

The gossip state transition from pending to normal isn't exact, it's
propagated via gossip (so it's seconds of change where reads/writes can hit
either replica), but the increase in writes (writing to both destinations)
should make it safe in that transition. It's being rewritten to be
transactional in an upcoming version of cassandra.



On Tue, Oct 24, 2023 at 11:39 PM Vikas Kumar <hers...@gmail.com> wrote:

> Hi folks,
>
> I am looking for some resources to understand the internals of rebalancing
> in Cassandra. Specifically:
>
>  - How are read and write queries served during data migration?
>  - How is the cutover from the current node to the new node performed?
>
> Any help is greatly appreciated.
>
> Thanks,
> Vikas
>

Reply via email to