Re: Rebalance a cassandra cluster
As Kurt mentioned, you definitely need to pick a partition key that ensures data is uniformly distributed. If you want to redistribute the data in the cluster and move tokens around, you could decommission the node with the tokens you want to redistribute and then bootstrap a new node into the cluster. However, be careful: if there are unbalanced partitions in the cluster, redistributing the tokens will just move the problem partition to another node. The same problem will then occur on the node that picks up the problem partition key, and you will be back in the same situation again. Regards, Anthony
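The point about a problem partition simply moving to another node can be illustrated with a small simulation. This is only a sketch under stated assumptions: MD5 stands in for Cassandra's Murmur3 hash, each node holds a single token, replication is ignored, and the key names, partition sizes, and helper functions (`token_for`, `owner`, `load_per_node`, `even_tokens`) are hypothetical and not part of any Cassandra API.

```python
import hashlib
from bisect import bisect_left

RING_MIN = -2**63   # lowest token on a signed 64-bit ring
RING_SIZE = 2**64   # number of distinct tokens on the ring


def token_for(key):
    """Map a partition key to a signed 64-bit token.

    Assumption: MD5 is used only as a deterministic stand-in;
    Cassandra's default partitioner uses Murmur3.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


def owner(token, node_tokens):
    """Return the node whose token is the first one >= token, wrapping around."""
    ring = sorted((t, name) for name, t in node_tokens.items())
    idx = bisect_left([t for t, _ in ring], token)
    return ring[idx % len(ring)][1]


def load_per_node(partition_sizes, node_tokens):
    """Sum partition sizes per owning node (replication is ignored)."""
    load = {name: 0 for name in node_tokens}
    for key, size in partition_sizes.items():
        load[owner(token_for(key), node_tokens)] += size
    return load


def even_tokens(node_count, offset=0):
    """Evenly spaced tokens, optionally shifted by offset (0 <= offset < step)."""
    step = RING_SIZE // node_count
    return {f"node{i}": RING_MIN + offset + i * step for i in range(node_count)}


# 1000 small partitions plus one hypothetical oversized partition.
sizes = {f"user-{i}": 1 for i in range(1000)}
sizes["hot-tenant"] = 400

original = even_tokens(4)
shifted = even_tokens(4, offset=(RING_SIZE // 4) // 2)

for label, layout in (("original", original), ("shifted", shifted)):
    load = load_per_node(sizes, layout)
    hot_node = owner(token_for("hot-tenant"), layout)
    print(label, load, "hot partition on", hot_node)
```

In both layouts the node that owns `hot-tenant` carries the largest load, because a single partition cannot be split across token ranges. Moving tokens only changes which node that is.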
Re: Rebalance a cassandra cluster
You should choose a partition key that gives you a uniform distribution of partitions amongst the nodes, and refrain from having too many wide rows or a small number of wide partitions. If your tokens are already uniformly distributed, recalculating them in order to achieve a better data load balance is probably going to be an exercise in futility, and it is not really a good idea from a maintenance and scaling perspective either.
Re: Rebalance a cassandra cluster
Suppose I have a Cassandra cluster whose data is skewed such that one node has 40% more data than the other nodes. The tokens were distributed uniformly when the cluster was created. To make the data uniform, I would have to recalculate the tokens and assign them to the nodes in the cluster, then run repair and cleanup. The question is: how do I recalculate the tokens and assign them to nodes, keeping cost, distance between nodes, and data movement in mind? Regards, Akshit Jain
Re: Rebalance a cassandra cluster
Hi, you should make sure that the token range is evenly distributed if you have a single token configured per node. You can use, for example, this tool to calculate tokens: https://www.geroba.com/cassandra/cassandra-token-calculator/ Also, make sure that none of the partitions in your data model are hotspots that contain a lot more data than average. Also check materialized views if you use them. Finally, because of the way compaction works, it is normal for disk usage to go up and down. Since nodes often do that in different rhythms, you will always see some nodes using more disk space than others at a given point in time, especially if you do updates and not just inserts. Cheers, Hannu
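For a single-token-per-node setup, the calculation such a tool performs can be sketched in a few lines. This is a minimal sketch, assuming the Murmur3Partitioner, whose tokens span -2^63 to 2^63 - 1; the thread does not say which partitioner is in use, and the function names here are illustrative only.

```python
RING_MIN = -2**63   # smallest Murmur3Partitioner token (assumed partitioner)
RING_SIZE = 2**64   # number of distinct tokens on the ring


def evenly_spaced_tokens(node_count):
    """Return one token per node, spaced evenly around the ring."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    step = RING_SIZE // node_count
    return [RING_MIN + i * step for i in range(node_count)]


def ownership(tokens):
    """Fraction of the ring owned by each token.

    A node owns the range (previous token, its own token], wrapping
    around the ring, so the first token's predecessor is the last token.
    """
    ordered = sorted(tokens)
    fractions = {}
    for i, tok in enumerate(ordered):
        prev = ordered[i - 1]            # i == 0 wraps to the last token
        width = (tok - prev) % RING_SIZE
        fractions[tok] = (width or RING_SIZE) / RING_SIZE  # one node owns it all
    return fractions


tokens = evenly_spaced_tokens(4)
for tok, share in ownership(tokens).items():
    print(f"{tok:>21}  {share:.2%}")
```

Comparing `ownership()` for the tokens actually configured on the nodes against these evenly spaced values shows whether token placement is the cause of the imbalance at all. If the shares are already equal, the skew comes from the data model or from compaction timing, as described above.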
RE: Rebalance a cassandra cluster
Check with nodetool repair. Harika Vangapelli
Rebalance a cassandra cluster
Hi, Can a Cassandra cluster be unbalanced in terms of data? If so, how do I rebalance it?