Hi,
I had a 3-node cluster (256 vnodes each) with RF=3. I mistakenly added a
fourth node with "num_tokens: 1" (that is, one vnode). I've always
understood the number of vnodes to be proportional to the amount of data a
node receives. Therefore, I was expecting the new node to receive something
like 1/(1+3*256) of the cluster's data. However, this is not the case:
$ nodetool status mydatacenter
Datacenter: Cassandra
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address  Load       Tokens  Owns (effective)  Host ID                               Rack
UN  X.X.X.2  200.42 GB  256     87.6%             871968c9-1d6b-4f06-ba90-8b3a8d92dcf0  RAC1
UN  X.X.X.3  198.03 GB  256     53.7%             d7cacd89-8613-4de5-8a5e-a2c53c41ea45  RAC1
UN  X.X.X.4  110.57 GB  1       58.7%             55daa807-af49-44c5-9742-fe456df621a1  RAC1
UN  X.X.X.5  199.81 GB  256     100.0%            48cb0782-6c9a-4805-9330-38e192b6b680  RAC1
The new node added is "X.X.X.4". Note that I haven't executed `nodetool
cleanup` on the old nodes yet.
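To put numbers on the mismatch, here is a small sketch that only redoes the arithmetic on the figures from the `nodetool status` output above. The variable names are mine, and using X.X.X.5 (which reports 100% ownership) as a stand-in for "one full copy of the data" is an assumption, not something the output states.

```python
# Figures copied from the `nodetool status` output above.
tokens = {"X.X.X.2": 256, "X.X.X.3": 256, "X.X.X.4": 1, "X.X.X.5": 256}
owns_effective = {"X.X.X.2": 87.6, "X.X.X.3": 53.7, "X.X.X.4": 58.7, "X.X.X.5": 100.0}
load_gb = {"X.X.X.2": 200.42, "X.X.X.3": 198.03, "X.X.X.4": 110.57, "X.X.X.5": 199.81}

# What I expected: data share proportional to the number of vnodes.
expected_share = tokens["X.X.X.4"] / sum(tokens.values())  # 1 / (1 + 3*256)

# With RF=3 every row is stored three times, so the effective
# ownership percentages should add up to roughly 300%.
total_owns = sum(owns_effective.values())

# Assumption: X.X.X.5 owns 100%, so its load approximates one full copy.
load_ratio = load_gb["X.X.X.4"] / load_gb["X.X.X.5"]

print(f"expected token share of new node: {expected_share:.2%}")
print(f"reported effective ownership:     {owns_effective['X.X.X.4']}%")
print(f"sum of effective ownership:       {total_owns:.1f}%")
print(f"load of new node vs. X.X.X.5:     {load_ratio:.1%}")
```

So the expectation was roughly 0.13% of the data, while both the reported ownership and the on-disk load of the new node are more than half of a full copy.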
Additional information:
* I am using GossipingPropertyFileSnitch. All nodes are in the same
datacenter and rack.
* There are no pending compactions on the node.
Could anyone explain to me why my new node is receiving more data than
expected? Does this have to do with the way the GossipingPropertyFileSnitch
decides where to put secondary/tertiary replicas (i.e. always "next physical
node" in the ring)? Do I need to execute `nodetool cleanup` on newly
commissioned nodes as well?
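To make the "next physical node" hypothesis concrete, here is a minimal sketch of a token ring. It assumes that each range is stored on the first RF distinct nodes found walking clockwise from the range's token, which is my understanding of placement when all nodes share one rack. The function `effective_ownership`, the node names, and the random tokens in the unit interval are all hypothetical; this is not Cassandra's code, and real tokens will differ.

```python
import random


def effective_ownership(tokens_per_node, rf, seed=0):
    """Estimate each node's effective ownership on a simulated ring.

    Assumption: a range is replicated to the first `rf` distinct nodes
    found walking clockwise, starting at the token that ends the range.
    """
    rng = random.Random(seed)
    # One (token, node) entry per vnode; tokens are random points in [0, 1).
    ring = sorted(
        (rng.random(), node)
        for node, count in tokens_per_node.items()
        for _ in range(count)
    )
    # Cannot place more replicas than there are distinct nodes on the ring.
    rf = min(rf, len({node for _, node in ring}))
    owned = {node: 0.0 for node in tokens_per_node}
    for i, (token, _) in enumerate(ring):
        # Width of the range (previous token, token], wrapping at index 0.
        width = (token - ring[i - 1][0]) % 1.0
        replicas = []
        j = i
        while len(replicas) < rf:
            node = ring[j % len(ring)][1]
            if node not in replicas:
                replicas.append(node)
            j += 1
        for node in replicas:
            owned[node] += width
    return owned


if __name__ == "__main__":
    cluster = {"node-a": 256, "node-b": 256, "node-c": 256, "node-new": 1}
    for node, share in effective_ownership(cluster, rf=3).items():
        print(f"{node}: {share:.1%}")
```

Running it with different `seed` values gives a feel for how much a one-token node's share could vary under this placement rule, which can then be compared against the 58.7% reported above.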
Thanks,
Jens
--
Jens Rantil
Backend engineer
Tink AB