Thanks for sharing this clear explanation, Jeff. Cheers! On Wed, Oct 25, 2023 at 5:58 PM Jeff Jirsa <jji...@gmail.com> wrote:
> Data ownership is defined by the token ring concept. > > Hosts in the cluster may have tokens - let's oversimplify to 5 hosts, each > with 1 token A=0, B=1000, C=2000, D=3000, E=4000 > > The partition key is hashed to calculate the token, and the next 3 hosts > in the ring are the "owners" of that data - a key that hashes to 1234 would > be found on hosts C, D, E > > Anytime hosts move tokens (joining/expansion, leaving/shrink, > re-arranging/moves), the tokens go into a pending state. > > So if you were to add a 6th host here, let's say F=2500, when it first > gossips, it'll have 2500 in a different JOINING (pending) state. In that > state, it won't get any read traffic, but the quorum calculations will be > augmented to send extra writes - instead of needing 2/3 nodes to ack any > write, it'll require a total of 3 acks (of the 4 possible replicas, the 3 > natural replicas and the 1 pending replica). > > When the node finishes joining, and it gossips its state=NORMAL, it'll be > removed from pending, and the reads will move to it instead. > > The gossip state transition from pending to normal isn't exact, it's > propagated via gossip (so it's seconds of change where reads/writes can hit > either replica), but the increase in writes (writing to both destinations) > should make it safe in that transition. It's being rewritten to be > transactional in an upcoming version of cassandra. > > > > On Tue, Oct 24, 2023 at 11:39 PM Vikas Kumar <hers...@gmail.com> wrote: > >> Hi folks, >> >> I am looking for some resources to understand the internals of >> rebalancing in Cassandra. Specifically: >> >> - How are read and write queries served during data migration? >> - How is the cutover from the current node to the new node performed? >> >> Any help is greatly appreciated. >> >> Thanks, >> Vikas >> >