The new node will own some parts (ranges) of the ring, determined by the tokens it is responsible for. These tokens are set in cassandra.yaml through the property initial_token (manual assignment) or num_tokens (random assignment).
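To make the token/range relationship concrete, here is a minimal sketch in Python. It is not Cassandra code. It assumes the usual convention that a token owns the range (previous token, token], and it ignores replication, so only the primary range is considered. The function name owned_ranges and the toy ring are made up for illustration.

```python
from bisect import bisect_left


def owned_ranges(ring, new_tokens):
    """Return the ranges a joining node would take over.

    ring       -- dict mapping each existing token to its node name (hypothetical)
    new_tokens -- tokens assigned to the joining node
    Each result is (start_exclusive, end_inclusive, previous_owner).
    """
    existing = sorted(ring)
    all_tokens = sorted(set(existing) | set(new_tokens))
    result = []
    for token in sorted(new_tokens):
        # The range starts right after the preceding token on the ring.
        # Index -1 wraps around to the last token when this is the first one.
        start = all_tokens[all_tokens.index(token) - 1]
        # Before the join, the range belonged to the next existing token
        # clockwise (wrapping around at the end of the ring).
        successor = existing[bisect_left(existing, token) % len(existing)]
        result.append((start, token, ring[successor]))
    return result


# Toy ring with three nodes; the joining node gets tokens 50 and 250.
ring = {0: "A", 100: "B", 200: "C"}
for start, end, source in owned_ranges(ring, [50, 250]):
    print(f"({start}, {end}] would be streamed from node {source}")
```

In this simplified model the previous owner of each range is the node the data would be streamed from. With replication the real source selection is more involved.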
During the bootstrap process, the sections of sstables containing the ranges the new node is responsible for are transferred as raw data from the nodes that previously owned those ranges, so the source sstables are rebuilt on the joining node. After each sstable is transferred, the new node rebuilds its primary and secondary indexes, bloom filters, etc., and at the end of the bootstrap process the new sstables are added to the live data set.

See org.apache.cassandra.dht.BootStrapper.java and org.apache.cassandra.streaming.StreamReceiveTask in the trunk branch for more information.

PS: I don't recall any document with specific details, so if anyone knows of one, please feel free to share. If you want more theoretical information, see the ring membership sections of the Cassandra and/or Dynamo papers.

2015-12-24 13:14 GMT-02:00 Sergi Vladykin <[email protected]>:

> Guys,
>
> I was not able to find in docs or in google detailed description of data
> rebalancing algorithm.
>
> I mean how Cassandra moves SSTables when new node connects to the cluster,
> how primary and secondary indexes are getting transferred to this new node,
> etc..
>
> Can anyone provide relevant links please or just reply here?
>
> I can read source code of course, but it would be nice if someone could
> answer right away :)
>
> Sergi
