Re: Unbalanced cluster

2017-07-11 Thread Jonathan Haddad
Awesome utility Avi! Thanks for sharing. On Tue, Jul 11, 2017 at 10:57 AM Avi Kivity wrote: > There is now a readme with some examples and a build file. > > On 07/11/2017 11:53 AM, Avi Kivity wrote: > > Yeah, posting a github link carries an implied undertaking to write a >

Re: Unbalanced cluster

2017-07-11 Thread Avi Kivity
There is now a readme with some examples and a build file. On 07/11/2017 11:53 AM, Avi Kivity wrote: Yeah, posting a github link carries an implied undertaking to write a README file and make it easily buildable. I'll see what I can do. On 07/11/2017 06:25 AM, Nate McCall wrote: You

Re: Unbalanced cluster

2017-07-11 Thread Avi Kivity
Yeah, posting a github link carries an implied undertaking to write a README file and make it easily buildable. I'll see what I can do. On 07/11/2017 06:25 AM, Nate McCall wrote: You wouldn't have a build file lying around for that, would you? On Tue, Jul 11, 2017 at 3:23 PM, Nate McCall

Re: Unbalanced cluster

2017-07-11 Thread Avi Kivity
It is ScyllaDB specific. Scylla divides data not only among nodes, but also internally within a node among cores (=shards in our terminology). In the past we had problems with shards being over- and under-utilized (just like your cluster), so this simulator was developed to validate the

Re: Unbalanced cluster

2017-07-11 Thread Loic Lambiel
Thanks for the hint and tool! By the way, what does the --shards parameter mean? Thanks Loic On 07/10/2017 05:20 PM, Avi Kivity wrote: > 32 tokens is too few for 33 nodes. I have a sharding simulator [1] and > it shows > > > $ ./shardsim --vnodes 32 --nodes 33 --shards 1 > 33 nodes, 32

Re: Unbalanced cluster

2017-07-10 Thread Nate McCall
You wouldn't have a build file lying around for that, would you? On Tue, Jul 11, 2017 at 3:23 PM, Nate McCall wrote: > On Tue, Jul 11, 2017 at 3:20 AM, Avi Kivity wrote: > >> >> >> >> [1] https://github.com/avikivity/shardsim >> > > Avi, that's super

Re: Unbalanced cluster

2017-07-10 Thread Nate McCall
On Tue, Jul 11, 2017 at 3:20 AM, Avi Kivity wrote: > > > > [1] https://github.com/avikivity/shardsim > Avi, that's super handy - thanks for posting.

Re: Unbalanced cluster

2017-07-10 Thread kurt greaves
The reason for the default of 256 vnodes is that at that many tokens, the random distribution of tokens is enough to balance out each node's token allocation almost evenly. Any less and some nodes will get far more unbalanced, as Avi has shown. In 3.0 there is a new token allocating algorithm

Re: Unbalanced cluster

2017-07-10 Thread Avi Kivity
32 tokens is too few for 33 nodes. I have a sharding simulator [1] and it shows $ ./shardsim --vnodes 32 --nodes 33 --shards 1 33 nodes, 32 vnodes, 1 shards maximum node overcommit: 1.42642 maximum shard overcommit: 1.426417 So 40% overcommit over the average. Since some nodes can be
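The effect behind these numbers can be reproduced with a short simulation. This is only a sketch of the idea, not Avi's actual shardsim (the function name and fixed seed are my own): give every node some random tokens, sort them into a ring, and compare the busiest node's ownership against the average. Running it with 256 vnodes instead of 32 also shows why the higher default balances much better.

```python
import random

RING = 2**127  # size of the token space

def max_overcommit(nodes, vnodes, seed=42):
    """Give every node `vnodes` random tokens, then report how far the
    most-loaded node sits above the ideal average ownership of 1/nodes."""
    rng = random.Random(seed)
    tokens = sorted((rng.randrange(RING), n)
                    for n in range(nodes) for _ in range(vnodes))
    owned = [0] * nodes
    prev = tokens[-1][0] - RING  # wrap-around: range preceding the first token
    for tok, node in tokens:
        owned[node] += tok - prev
        prev = tok
    return max(owned) / (RING / nodes)

# More random tokens per node -> placement luck averages out.
print(f"33 nodes, 32 vnodes:  overcommit {max_overcommit(33, 32):.2f}")
print(f"33 nodes, 256 vnodes: overcommit {max_overcommit(33, 256):.2f}")
```

With few vnodes the overcommit lands well above 1.0, in the same neighborhood as the 1.42 quoted above; with 256 vnodes it drops close to 1.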

Re: Unbalanced cluster with RandomPartitioner

2012-01-23 Thread aaron morton
Setting a token outside of the partitioner range sounds like a bug. It's mostly an issue with the RP, but I guess a custom partitioner may also want to validate tokens are within a range. Can you report it to https://issues.apache.org/jira/browse/CASSANDRA Thanks - Aaron

Re: Unbalanced cluster with RandomPartitioner

2012-01-21 Thread Marcel Steinbach
I thought about our issue again and was thinking, maybe the describeOwnership should take into account if a token is outside the partitioner's maximum token range? To recap our problem: we had tokens that were apart by 12.5% of the token range 2**127; however, we had an offset on each token,

Re: Unbalanced cluster with RandomPartitioner

2012-01-20 Thread Marcel Steinbach
On 19.01.2012, at 20:15, Narendra Sharma wrote: I believe you need to move the nodes on the ring. What was the load on the nodes before you added 5 new nodes? It's just that you are getting data in certain token ranges more than others. With three nodes, it was also imbalanced. What I don't

Re: Unbalanced cluster with RandomPartitioner

2012-01-20 Thread Marcel Steinbach
Thanks for all the responses! I found our problem: Using the Random Partitioner, the key range is from 0..2**127. When we added nodes, we generated the keys and, out of convenience, we added an offset to the tokens because the move was easier like that. However, we did not execute the modulo
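The missing-modulo mistake described here is easy to demonstrate. In this sketch the node count and offset are illustrative values, not the cluster's actual ones: shifting evenly spaced RandomPartitioner tokens without reducing them modulo 2**127 pushes the top tokens off the ring.

```python
RING = 2**127  # RandomPartitioner tokens must lie in [0, 2**127)

def evenly_spaced(n, offset=0):
    """Evenly spaced tokens shifted by an offset, *without* the modulo."""
    return [i * RING // n + offset for i in range(n)]

n, offset = 8, RING // 4          # illustrative values only
bad = evenly_spaced(n, offset)
good = [t % RING for t in bad]    # the missing step from the thread

print(sum(t >= RING for t in bad))       # 2 tokens landed off the ring
print(all(0 <= t < RING for t in good))  # True
```

Any token at or beyond 2**127 owns a range describeOwnership can't account for sensibly, which is exactly the imbalance the thread describes.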

Re: Unbalanced cluster with RandomPartitioner

2012-01-19 Thread Marcel Steinbach
On 18.01.2012, at 02:19, Maki Watanabe wrote: Are there any significant differences in the number of sstables on each node? No, no significant difference there. Actually, node 8 is among those with more sstables but with the least load (20GB) On 17.01.2012, at 20:14, Jeremiah Jordan wrote: Are you

Re: Unbalanced cluster with RandomPartitioner

2012-01-19 Thread Marcel Steinbach
2012/1/19 aaron morton aa...@thelastpickle.com: If you have performed any token moves the data will not be deleted until you run nodetool cleanup. We did that after adding nodes to the cluster. And then, the cluster wasn't balanced either. Also, does the Load really account for dead data, or is

Re: Unbalanced cluster with RandomPartitioner

2012-01-19 Thread aaron morton
Load reported from nodetool ring is the live load, which means SSTables that the server has open and will read from during a request. This will include tombstones, expired and overwritten data. nodetool cfstats also includes dead load, which is sstables that are no longer in use but still on disk.

Re: Unbalanced cluster with RandomPartitioner

2012-01-19 Thread Narendra Sharma
I believe you need to move the nodes on the ring. What was the load on the nodes before you added 5 new nodes? It's just that you are getting data in certain token ranges more than others. -Naren On Thu, Jan 19, 2012 at 3:22 AM, Marcel Steinbach marcel.steinb...@chors.de wrote: On 18.01.2012,

Re: Unbalanced cluster with RandomPartitioner

2012-01-18 Thread aaron morton
If you have performed any token moves the data will not be deleted until you run nodetool cleanup. To get a baseline I would run nodetool compact to do major compaction and purge any tomb stones as others have said. Cheers - Aaron Morton Freelance Developer @aaronmorton

Re: Unbalanced cluster with RandomPartitioner

2012-01-17 Thread Mohit Anchlia
Have you tried running repair first on each node? Also, verify using df -h on the data dirs On Tue, Jan 17, 2012 at 7:34 AM, Marcel Steinbach marcel.steinb...@chors.de wrote: Hi, we're using RP and have each node assigned the same amount of the token space. The cluster looks like that:

Re: Unbalanced cluster with RandomPartitioner

2012-01-17 Thread Marcel Steinbach
We are running regular repairs, so I don't think that's the problem. And the data dir sizes match approx. the load from nodetool. Thanks for the advice, though. Our keys are digits only, and all contain a few zeros at the same offsets. I'm not that familiar with the md5 algorithm, but I
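On the md5 question: the point of hash partitioning is that structured keys still spread evenly across the ring. A small check, where the key shape is invented to match the description (digits only, zeros at fixed offsets) and the token fold is a simplification of what RandomPartitioner actually does:

```python
import hashlib

RING = 2**127

def token(key: bytes) -> int:
    """RandomPartitioner-style token: md5 of the key folded into [0, 2**127)."""
    return int.from_bytes(hashlib.md5(key).digest(), "big") % RING

# Digit-only keys with zeros at the same offsets (shape assumed for illustration).
keys = [f"00{i:04d}00".encode() for i in range(10_000)]
buckets = [0] * 8                       # token count per eighth of the ring
for k in keys:
    buckets[token(k) * 8 // RING] += 1

print(buckets)  # roughly 1250 per bucket: the structure in the keys vanishes
```

So shared zeros in the keys cannot by themselves skew a RandomPartitioner cluster; the imbalance had to come from the token assignment, as the thread eventually found.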

Re: Unbalanced cluster with RandomPartitioner

2012-01-17 Thread Jeremiah Jordan
Are you deleting data or using TTLs? Expired/deleted data won't go away until the sstable holding it is compacted. So if compaction has happened on some nodes, but not on others, you will see this. The disparity is pretty big, 400GB to 20GB, so this probably isn't the issue, but with our

Re: Unbalanced cluster with RandomPartitioner

2012-01-17 Thread Maki Watanabe
Are there any significant differences in the number of sstables on each node? 2012/1/18 Marcel Steinbach marcel.steinb...@chors.de: We are running regular repairs, so I don't think that's the problem. And the data dir sizes match approx. the load from nodetool. Thanks for the advice, though.