Newly added node getting more data than expected

Jens Rantil Sun, 07 Jun 2015 14:21:39 -0700

Hi,

I had a 3-node cluster (256 vnodes each) with RF=3. I mistakenly added a
fourth node with "num_tokens: 1" (that is, one vnode). My understanding has
always been that the number of vnodes is proportional to the amount of data
a node receives. Therefore, I was expecting the node to receive something
like 1/(1+3*256) of the cluster's data. However, this is not the case:


$ nodetool status mydatacenter
Datacenter: Cassandra
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address  Load       Tokens  Owns (effective)  Host ID                               Rack
UN  X.X.X.2  200.42 GB  256     87.6%             871968c9-1d6b-4f06-ba90-8b3a8d92dcf0  RAC1
UN  X.X.X.3  198.03 GB  256     53.7%             d7cacd89-8613-4de5-8a5e-a2c53c41ea45  RAC1
UN  X.X.X.4  110.57 GB  1       58.7%             55daa807-af49-44c5-9742-fe456df621a1  RAC1
UN  X.X.X.5  199.81 GB  256     100.0%            48cb0782-6c9a-4805-9330-38e192b6b680  RAC1

The new node added is "X.X.X.4". Note that I haven't executed `nodetool
cleanup` on the old nodes yet.
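
As a sanity check on the numbers above, here is a minimal Python sketch of
the arithmetic. It assumes all token ranges are roughly the same size, so it
only approximates the expected primary share; the dictionary names are made
up for illustration, and the values are copied from the output above.

```python
# Token counts per node, copied from the "Tokens" column above.
tokens = {"X.X.X.2": 256, "X.X.X.3": 256, "X.X.X.4": 1, "X.X.X.5": 256}

# Effective ownership in percent, copied from the "Owns (effective)" column.
owns = {"X.X.X.2": 87.6, "X.X.X.3": 53.7, "X.X.X.4": 58.7, "X.X.X.5": 100.0}

# Naive expectation: one token out of 1 + 3*256 equally sized ranges.
total_tokens = sum(tokens.values())
naive_share = tokens["X.X.X.4"] / total_tokens
print(f"total tokens: {total_tokens}")
print(f"naive share of X.X.X.4: {naive_share:.4%}")

# With RF=3 every row is stored three times, so the effective ownership
# column should add up to roughly 300%.
print(f"sum of effective ownership: {sum(owns.values()):.1f}%")
```

The ownership column does add up to 300%, which matches RF=3, so the
reported figures are at least internally consistent.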

Additional information:
 * I am using GossipingPropertyFileSnitch. All nodes are the same
datacenter and rack.
 * There are no pending compactions on the node.

Could anyone explain to me why my new node is receiving more data than
expected? Does this have to do with the way the GossipingPropertyFileSnitch
decides where to put secondary/tertiary replicas (i.e. always the "next
physical node" in the ring)? Do I also need to execute `nodetool cleanup` on
newly commissioned nodes?
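
To make the "next physical node in ring" idea concrete, here is a rough
Python sketch of a simplified placement model. The assumptions are mine, not
taken from the Cassandra source: tokens are drawn uniformly at random from a
64-bit ring, and each token range is replicated to the next RF distinct
nodes found walking clockwise. The function name `effective_ownership` is
hypothetical. A real cluster with a single rack may behave differently.

```python
import random


def effective_ownership(tokens_per_node, rf, seed=0):
    """Return each node's share of the ring (0..1) in a simplified model."""
    rng = random.Random(seed)
    ring_size = 2**64

    # Assumption: every vnode gets a uniformly random token.
    ring = sorted(
        (rng.randrange(ring_size), node)
        for node, count in tokens_per_node.items()
        for _ in range(count)
    )

    owned = {node: 0 for node in tokens_per_node}
    n = len(ring)
    for i, (token, _) in enumerate(ring):
        # Width of the range (previous token, token], wrapping around the ring.
        width = (token - ring[i - 1][0]) % ring_size

        # Walk clockwise and collect the first rf distinct nodes.
        replicas = []
        j = i
        while len(replicas) < rf and j < i + n:
            node = ring[j % n][1]
            if node not in replicas:
                replicas.append(node)
            j += 1

        for node in replicas:
            owned[node] += width

    return {node: owned[node] / ring_size for node in owned}


shares = effective_ownership(
    {"X.X.X.2": 256, "X.X.X.3": 256, "X.X.X.4": 1, "X.X.X.5": 256}, rf=3
)
for node, share in sorted(shares.items()):
    print(f"{node}: {share:.1%}")
```

The printed shares can be compared with the "Owns (effective)" column above
to see how far the real cluster is from this simplified model.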

Thanks,
Jens

-- 
Jens Rantil
Backend engineer
Tink AB

Email: [email protected]
Phone: +46 708 84 18 32
Web: www.tink.se

