Re: Unbalanced cluster with RandomPartitioner

2012-01-23 Thread aaron morton
Setting a token outside of the partitioner range sounds like a bug. It's mostly an issue with the RP, but I guess a custom partitioner may also want to validate that tokens are within a range. Can you report it at https://issues.apache.org/jira/browse/CASSANDRA Thanks - Aaron

Re: Unbalanced cluster with RandomPartitioner

2012-01-21 Thread Marcel Steinbach
I thought about our issue again and was wondering whether describeOwnership should take into account whether a token is outside the partitioner's maximum token range. To recap our problem: we had tokens that were 12.5% of the token range 2**127 apart; however, we had an offset on each token,
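
To illustrate the recap above, here is a minimal sketch of how evenly spaced tokens can still yield uneven ownership once an offset pushes some of them past 2**127. The function `effective_ownership`, the 8-node layout, and the offset value are assumptions made for illustration; this is not Cassandra's describeOwnership implementation. It assumes distinct tokens, that keys always fall in [0, 2**127), and that a key belongs to the first token greater than or equal to it, wrapping to the smallest token.

```python
RING_SIZE = 2**127  # RandomPartitioner token range as stated in the thread


def effective_ownership(tokens):
    """Fraction of the key range [0, RING_SIZE) covered by each (distinct) token.

    Assumed rule: a key belongs to the first token >= key, and keys above
    every token wrap around to the smallest token.
    """
    ordered = sorted(tokens)
    owned = {}
    prev = 0
    for token in ordered:
        upper = min(token, RING_SIZE)       # keys never exceed the ring size
        owned[token] = max(upper - prev, 0)
        prev = max(prev, upper)
    owned[ordered[0]] += RING_SIZE - prev   # wrap-around range, if any is left
    return [owned[token] / RING_SIZE for token in ordered]


step = RING_SIZE // 8        # 12.5% of the ring between neighbouring tokens
offset = RING_SIZE // 2      # hypothetical offset, chosen for illustration

unwrapped = [i * step + offset for i in range(8)]
wrapped = [t % RING_SIZE for t in unwrapped]

print(effective_ownership(unwrapped))
print(effective_ownership(wrapped))
```

Under these assumptions, the unwrapped tokens leave three nodes with no keys at all and one node with half the ring, even though neighbouring tokens are exactly 12.5% apart; the wrapped tokens give every node 12.5%.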

Re: Unbalanced cluster with RandomPartitioner

2012-01-20 Thread Marcel Steinbach
On 19.01.2012, at 20:15, Narendra Sharma wrote: I believe you need to move the nodes on the ring. What was the load on the nodes before you added 5 new nodes? It's just that you are getting more data in certain token ranges than in others. With three nodes, it was also imbalanced. What I don't

Re: Unbalanced cluster with RandomPartitioner

2012-01-20 Thread Marcel Steinbach
Thanks for all the responses! I found our problem: using the RandomPartitioner, the key range is 0..2**127. When we added nodes, we generated the keys and, out of convenience, added an offset to the tokens because the move was easier that way. However, we did not execute the modulo
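
A minimal sketch of the token generation described above, with the wrap-around step applied. The helper name `balanced_tokens` and the example offset are hypothetical; the only fact taken from the thread is the 0..2**127 range.

```python
RING_SIZE = 2**127  # RandomPartitioner token range as stated in the thread


def balanced_tokens(node_count, offset=0):
    """Evenly spaced tokens, shifted by offset and wrapped back into the ring."""
    step = RING_SIZE // node_count
    # The modulo keeps every shifted token inside [0, RING_SIZE).
    return [(i * step + offset) % RING_SIZE for i in range(node_count)]


if __name__ == "__main__":
    # Hypothetical offset, large enough to push half the tokens past 2**127.
    for token in balanced_tokens(8, offset=RING_SIZE // 2 + 42):
        print(token)
```

Without the `% RING_SIZE`, any token at or above 2**127 would sit outside the range that keys can hash to.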

Re: Unbalanced cluster with RandomPartitioner

2012-01-19 Thread Marcel Steinbach
On 18.01.2012, at 02:19, Maki Watanabe wrote: Is there any significant difference in the number of sstables on each node? No, no significant difference there. Actually, node 8 is among those with more sstables but with the least load (20GB). On 17.01.2012, at 20:14, Jeremiah Jordan wrote: Are you

Re: Unbalanced cluster with RandomPartitioner

2012-01-19 Thread Marcel Steinbach
2012/1/19 aaron morton aa...@thelastpickle.com: If you have performed any token moves the data will not be deleted until you run nodetool cleanup. We did that after adding nodes to the cluster. And then, the cluster wasn't balanced either. Also, does the Load really account for dead data, or is

Re: Unbalanced cluster with RandomPartitioner

2012-01-19 Thread aaron morton
The Load reported by nodetool ring is the live load, meaning the SSTables that the server has open and will read from during a request. This includes tombstones and expired and overwritten data. nodetool cfstats also includes dead load, which is SSTables that are no longer in use but still on disk.

Re: Unbalanced cluster with RandomPartitioner

2012-01-19 Thread Narendra Sharma
I believe you need to move the nodes on the ring. What was the load on the nodes before you added 5 new nodes? It's just that you are getting more data in certain token ranges than in others. -Naren On Thu, Jan 19, 2012 at 3:22 AM, Marcel Steinbach marcel.steinb...@chors.de wrote: On 18.01.2012,

Re: Unbalanced cluster with RandomPartitioner

2012-01-18 Thread aaron morton
If you have performed any token moves, the data will not be deleted until you run nodetool cleanup. To get a baseline, I would run nodetool compact to do a major compaction and purge any tombstones, as others have said. Cheers - Aaron Morton Freelance Developer @aaronmorton

Unbalanced cluster with RandomPartitioner

2012-01-17 Thread Marcel Steinbach
Hi, we're using RP and have each node assigned the same amount of the token space. The cluster looks like that: Address Status State Load Owns Token

Re: Unbalanced cluster with RandomPartitioner

2012-01-17 Thread Mohit Anchlia
Have you tried running repair first on each node? Also, verify using df -h on the data dirs On Tue, Jan 17, 2012 at 7:34 AM, Marcel Steinbach marcel.steinb...@chors.de wrote: Hi, we're using RP and have each node assigned the same amount of the token space. The cluster looks like that:

Re: Unbalanced cluster with RandomPartitioner

2012-01-17 Thread Marcel Steinbach
We are running regular repairs, so I don't think that's the problem. And the data dir sizes approximately match the load reported by nodetool. Thanks for the advice, though. Our keys are digits only, and all contain a few zeros at the same offsets. I'm not that familiar with the md5 algorithm, but I

Re: Unbalanced cluster with RandomPartitioner

2012-01-17 Thread Jeremiah Jordan
Are you deleting data or using TTLs? Expired/deleted data won't go away until the sstable holding it is compacted. So if compaction has happened on some nodes but not on others, you will see this. The disparity is pretty big, 400GB to 20GB, so this probably isn't the issue, but with our

Re: Unbalanced cluster with RandomPartitioner

2012-01-17 Thread Maki Watanabe
Is there any significant difference in the number of sstables on each node? 2012/1/18 Marcel Steinbach marcel.steinb...@chors.de: We are running regular repairs, so I don't think that's the problem. And the data dir sizes approximately match the load reported by nodetool. Thanks for the advice, though.