Re: Rebalance a cassandra cluster

2017-09-15 Thread Anthony Grasso
As Kurt mentioned, you definitely need to pick a partition key that ensures
data is uniformly distributed.

If you want to redistribute the data in the cluster and move tokens
around, you could decommission the node with the tokens you want to
redistribute and then bootstrap a new node into the cluster. However, be
careful: if there are unbalanced partitions in the cluster, redistributing
the tokens will just move the problem partition to another node. The same
problem will then occur on the node that picks up the problem partition
key, and you will be back where you started.
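
To illustrate why moving tokens does not help with an oversized partition,
here is a minimal sketch of a toy ring. Everything in it is an assumption
made for illustration: the 0..99 token range, the node names, the partition
sizes, replication factor 1, and the helper names `owner` and
`load_per_node`. It is not how Cassandra itself computes ownership, only a
simplified model of "each node owns the tokens up to and including its own".

```python
from bisect import bisect_left


def owner(ring, token):
    """Return the node that owns `token` on a sorted list of (token, node) pairs."""
    tokens = [t for t, _ in ring]
    # A node owns every token up to and including its own; wrap past the end.
    return ring[bisect_left(tokens, token) % len(ring)][1]


def load_per_node(ring, partitions):
    """Sum partition sizes per owning node (replication factor 1 assumed)."""
    load = {node: 0 for _, node in ring}
    for token, size in partitions:
        load[owner(ring, token)] += size
    return load


# Toy ring with tokens 0..99: 20 partitions of size 10, except one hot partition.
partitions = [(t, 200 if t == 37 else 10) for t in range(2, 100, 5)]

before = [(24, "A"), (49, "B"), (74, "C"), (99, "D")]
# Node B is decommissioned and a new node E bootstraps with a different token.
after = [(24, "A"), (36, "E"), (74, "C"), (99, "D")]

print(load_per_node(before, partitions))
print(load_per_node(after, partitions))
```

With these made-up numbers the hot partition simply moves from node B to
node C, which then carries most of the load instead.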

Regards,
Anthony

On 13 September 2017 at 20:09, kurt greaves wrote:

> You should choose a partition key that gives you a uniform distribution
> of partitions among the nodes, and avoid having too many wide rows or a
> small number of very wide partitions. If your tokens are already
> uniformly distributed, recalculating them to achieve a better data load
> balance is probably going to be an exercise in futility, and it is not
> really a good idea from a maintenance and scaling perspective.
>


Re: Rebalance a cassandra cluster

2017-09-13 Thread kurt greaves
You should choose a partition key that gives you a uniform distribution
of partitions among the nodes, and avoid having too many wide rows or a
small number of very wide partitions. If your tokens are already
uniformly distributed, recalculating them to achieve a better data load
balance is probably going to be an exercise in futility, and it is not
really a good idea from a maintenance and scaling perspective.
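
As a rough illustration of how the partition key choice drives balance, the
sketch below compares two keys over the same hypothetical data set. The
assumptions are loud ones: the rows are invented, SHA-256 modulo the node
count stands in for Cassandra's Murmur3 token ring, row counts stand in for
data size, and the names `node_for` and `skew` are made up for this example.

```python
import hashlib
from collections import Counter


def node_for(key, node_count):
    """Map a partition key to a node. SHA-256 is a stand-in for Murmur3."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


def skew(rows, key_of, node_count=4):
    """Return max node load divided by mean node load (1.0 is perfectly even)."""
    load = Counter(node_for(key_of(row), node_count) for row in rows)
    return max(load.values()) / (len(rows) / node_count)


# Hypothetical data: 1000 users, 70% of them in one country.
rows = [
    {"user_id": i, "country": "US" if i % 10 < 7 else f"C{i % 10}"}
    for i in range(1000)
]

print("partition by country:", skew(rows, lambda r: r["country"]))
print("partition by user_id:", skew(rows, lambda r: r["user_id"]))
```

Partitioning by `country` puts all 700 "US" rows on a single node no matter
how the tokens are laid out, while partitioning by `user_id` spreads the rows
across many small partitions.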


Rebalance a cassandra cluster

2017-09-13 Thread Akshit Jain
Suppose I have a Cassandra cluster whose data is skewed such that one node
has 40% more data than the other nodes, even though the tokens were
distributed uniformly when the cluster was created.
To make the data uniform, I would have to recalculate the tokens and assign
them to the nodes in the cluster, then run repair and cleanup.
The question is: how do I recalculate the tokens and assign them to nodes
(keeping cost, distance between nodes, and data movement in mind)?


Re: Rebalance a cassandra cluster

2017-09-13 Thread Akshit Jain
Suppose I have a Cassandra cluster whose data is skewed such that one node
has 40% more data than the other nodes, even though the tokens were
distributed uniformly when the cluster was created.
To make the data uniform, I would have to recalculate the tokens and assign
them to the nodes in the cluster, then run repair and cleanup.
The question is: how do I recalculate the tokens and assign them to nodes
(keeping cost, distance between nodes, and data movement in mind)?

Regards
Akshit Jain


On Wed, Sep 13, 2017 at 11:54 AM, Hannu Kröger <hkro...@gmail.com> wrote:

> Hi,
>
> you should make sure that the token range is evenly distributed if you
> have a single token configured per node. You can use e.g. this tool to
> calculate tokens:
> https://www.geroba.com/cassandra/cassandra-token-calculator/
>
> Also, make sure that none of the partitions in your data model are
> hotspots that contain a lot more data than average. Also check
> materialized views if you use them.
>
> Also, due to the way compactions work, it’s normal that disk usage goes
> up and down. Since nodes often do that in different rhythms, you will
> always see some node(s) using more disk space than others at some point
> in time, especially if you do updates and not just inserts.
>
> Cheers,
> Hannu
>
> On 13 September 2017 at 07:47:09, Akshit Jain (akshit13...@iiitd.ac.in)
> wrote:
>
> Hi,
> Can a Cassandra cluster be unbalanced in terms of data?
> If yes, how can a Cassandra cluster be rebalanced?
>
>


Re: Rebalance a cassandra cluster

2017-09-13 Thread Hannu Kröger
Hi,

you should make sure that the token range is evenly distributed if you
have a single token configured per node. You can use e.g. this tool to
calculate tokens:
https://www.geroba.com/cassandra/cassandra-token-calculator/
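
For a cluster with one token per node, evenly spaced tokens can also be
computed by hand. The following is a minimal sketch, assuming the default
Murmur3Partitioner (token range -2^63 to 2^63 - 1), a single datacenter, and
no vnodes. The function name `evenly_spaced_tokens` is made up for this
example and is not part of any Cassandra tool. Each value would go into the
`initial_token` setting in the corresponding node's cassandra.yaml.

```python
def evenly_spaced_tokens(node_count):
    """Return one evenly spaced Murmur3Partitioner token per node.

    Assumes a single token per node (no vnodes) and a single datacenter.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    ring_size = 2**64      # number of tokens in the Murmur3 ring
    min_token = -(2**63)   # smallest Murmur3 token
    step = ring_size // node_count
    return [min_token + step * i for i in range(node_count)]


if __name__ == "__main__":
    for node, token in enumerate(evenly_spaced_tokens(4)):
        print(f"node {node}: initial_token = {token}")
```

For four nodes this yields -9223372036854775808, -4611686018427387904, 0 and
4611686018427387904.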

Also, make sure that none of the partitions in your data model are hotspots
that contain a lot more data than average. Also check materialized views
if you use them.

Also, due to the way compactions work, it’s normal that disk usage goes
up and down. Since nodes often do that in different rhythms, you will always
see some node(s) using more disk space than others at some point in time,
especially if you do updates and not just inserts.

Cheers,
Hannu

On 13 September 2017 at 07:47:09, Akshit Jain (akshit13...@iiitd.ac.in)
wrote:

Hi,
Can a Cassandra cluster be unbalanced in terms of data?
If yes, how can a Cassandra cluster be rebalanced?


RE: Rebalance a cassandra cluster

2017-09-13 Thread Harika Vangapelli -T (hvangape - AKRAYA INC at Cisco)
Check with nodetool repair.


From: Akshit Jain [mailto:akshit13...@iiitd.ac.in]
Sent: Tuesday, September 12, 2017 9:47 PM
To: user@cassandra.apache.org
Subject: Rebalance a cassandra cluster

Hi,
Can a Cassandra cluster be unbalanced in terms of data?
If yes, how can a Cassandra cluster be rebalanced?



Rebalance a cassandra cluster

2017-09-12 Thread Akshit Jain
Hi,
Can a Cassandra cluster be unbalanced in terms of data?
If yes, how can a Cassandra cluster be rebalanced?