Awesome utility Avi! Thanks for sharing.
On Tue, Jul 11, 2017 at 10:57 AM Avi Kivity wrote:
> There is now a readme with some examples and a build file.
There is now a readme with some examples and a build file.
On 07/11/2017 11:53 AM, Avi Kivity wrote:
> Yeah, posting a github link carries an implied undertaking to write a
> README file and make it easily buildable. I'll see what I can do.
Yeah, posting a github link carries an implied undertaking to write a
README file and make it easily buildable. I'll see what I can do.
On 07/11/2017 06:25 AM, Nate McCall wrote:
> You wouldn't have a build file laying around for that, would you?
It is ScyllaDB specific. Scylla divides data not only among nodes, but
also internally within a node among cores (= shards in our terminology).
In the past we had problems with shards being over- and under-utilized
(just like your cluster), so this simulator was developed to validate
the solution.
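For illustration only (this is not Scylla's actual routing code), a partition's
token can be thought of as being mapped twice: first onto the node ring, then
onto a per-core shard inside the owning node. A minimal Python sketch of that
idea, where the modulo-style shard step is purely illustrative:

import bisect
from hashlib import md5

RING = 2**127  # illustrative RandomPartitioner-style token space

def token_for(key: bytes) -> int:
    """Hash a partition key to a token in [0, RING)."""
    return int.from_bytes(md5(key).digest(), "big") % RING

def owner_node(token: int, node_tokens: list[int]) -> int:
    """Ring walk: the node with the first token >= the key's token owns it."""
    i = bisect.bisect_left(node_tokens, token)
    return i % len(node_tokens)  # wrap around to the smallest token

def owner_shard(token: int, shards: int) -> int:
    """Illustrative only: spread the token space evenly across per-core shards."""
    return token * shards // RING

node_tokens = sorted(token_for(f"node-{n}".encode()) for n in range(3))
t = token_for(b"some-partition-key")
print("node:", owner_node(t, node_tokens), "shard:", owner_shard(t, shards=8))

In this picture the simulator's --shards flag corresponds to the number of
cores per node.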
Thanks for the hint and tool!
By the way, what does the --shards parameter mean?
Thanks
Loic
You wouldn't have a build file laying around for that, would you?
On Tue, Jul 11, 2017 at 3:23 PM, Nate McCall wrote:
> Avi, that's super handy - thanks for posting.
On Tue, Jul 11, 2017 at 3:20 AM, Avi Kivity wrote:
> [1] https://github.com/avikivity/shardsim
Avi, that's super handy - thanks for posting.
The reason for the default of 256 vnodes is that, with that many tokens, the
random distribution of tokens is enough to balance out each node's token
allocation almost evenly. Any fewer and some nodes will get far more
unbalanced, as Avi has shown. In 3.0 there is a new token allocation
algorithm, however, which does a much better job of balancing ownership.
32 tokens is too few for 33 nodes. I have a sharding simulator [1] and
it shows
$ ./shardsim --vnodes 32 --nodes 33 --shards 1
33 nodes, 32 vnodes, 1 shards
maximum node overcommit: 1.42642
maximum shard overcommit: 1.426417
So 40% overcommit over the average. Since some nodes can be undercommitted as
well, the spread between the most and least loaded nodes is even larger.
[1] https://github.com/avikivity/shardsim
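This isn't shardsim itself, but the overcommit figure is easy to approximate in
a few lines of Python: scatter each node's random vnode tokens on the ring,
measure each node's share, and divide the largest share by the average. A
single random layout will wander around the 1.43 above for 32 vnodes, and the
same loop shows why the 256-vnode default discussed earlier sits much closer
to 1.0:

import random

RING = 2**127  # token space; only ratios matter here

def max_node_overcommit(nodes: int, vnodes: int, seed: int = 42) -> float:
    """Largest node ownership divided by average ownership, for random tokens."""
    rng = random.Random(seed)
    tokens = sorted((rng.randrange(RING), n)
                    for n in range(nodes) for _ in range(vnodes))
    owned = [0] * nodes
    prev = tokens[-1][0] - RING  # attribute the wrap-around gap to the first token
    for tok, node in tokens:
        owned[node] += tok - prev
        prev = tok
    return max(owned) / (RING / nodes)

for v in (32, 256):
    print(f"33 nodes, {v} vnodes -> max node overcommit ~ {max_node_overcommit(33, v):.2f}")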
Setting a token outside of the partitioner range sounds like a bug. It's mostly
an issue with the RP, but I guess a custom partitioner may also want to
validate that tokens are within range.
Can you report it to https://issues.apache.org/jira/browse/CASSANDRA
Thanks
-
Aaron Morton
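The range check suggested above amounts to very little code; a sketch (not
Cassandra's actual partitioner API), assuming a RandomPartitioner-style range
of 0..2**127:

RING = 2**127

def validate_token(token: int) -> int:
    """Reject configured tokens that fall outside [0, RING)."""
    if not 0 <= token < RING:
        raise ValueError(f"token {token} is outside the partitioner range [0, {RING})")
    return token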
I thought about our issue again and was thinking: maybe describeOwnership
should take into account whether a token is outside the partitioner's maximum
token range?
To recap our problem: we had tokens that were apart by 12.5% of the token
range 2**127; however, we had an offset on each token, which pushed some of
them outside that range.
Thanks for all the responses!
I found our problem:
Using the Random Partitioner, the key range is from 0..2**127. When we added
nodes, we generated the tokens and, out of convenience, added an offset to
them because the move was easier that way.
However, we did not apply the modulo 2**127, so some tokens ended up outside
the partitioner's range.
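To see why that skews the cluster even though the token gaps still look even,
here is a small illustration (not Cassandra code). With the offset applied and
the modulo forgotten, gap-based ownership reporting still shows 12.5% per
node, but the share of actual key tokens, which never exceed 2**127, drops to
~0% for the node whose token was pushed past the range and roughly doubles for
the node at the wrap point:

import bisect

RING = 2**127  # RandomPartitioner key tokens fall in [0, RING)

def gap_ownership(tokens):
    """What token-gap based reporting sees: each token owns the gap to its predecessor."""
    ts = sorted(tokens)
    return [((t - p) % RING) / RING for p, t in zip([ts[-1]] + ts, ts)]

def key_ownership(tokens, samples=100_000):
    """Fraction of key tokens in [0, RING) that each node actually receives."""
    ts = sorted(tokens)
    counts = [0] * len(ts)
    for k in range(samples):
        t = k * RING // samples           # stand-in for md5(key) reduced into [0, RING)
        i = bisect.bisect_left(ts, t)     # first node token >= key token
        counts[i % len(ts)] += 1          # wrap to the smallest token
    return [c / samples for c in counts]

even = [i * RING // 8 for i in range(8)]       # tokens 12.5% apart, as in the thread
shifted = [t + RING // 4 for t in even]        # offset added, modulo 2**127 forgotten

print([round(o, 3) for o in gap_ownership(shifted)])  # all 0.125 -- looks balanced
print([round(o, 3) for o in key_ownership(shifted)])  # ~0.25, six of 0.125, ~0.0 -- it isn't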
On 19.01.2012, at 20:15, Narendra Sharma wrote:
> I believe you need to move the nodes on the ring. What was the load on the
> nodes before you added 5 new nodes? It's just that you are getting data in
> certain token range more than others.
With three nodes, it was also imbalanced.
What I don't understand is why the distribution is so uneven even though the
tokens split the ring evenly.
I believe you need to move the nodes on the ring. What was the load on the
nodes before you added 5 new nodes? It's just that you are getting data in
certain token range more than others.
-Naren
Load reported from nodetool ring is the live load, which means SSTables that
the server has open and will read from during a request. This will include
tombstones, expired and overwritten data.
nodetool cfstats also includes "dead" load, which is SSTables that are no
longer in use but are still on disk.
2012/1/19 aaron morton :
> If you have performed any token moves the data will not be deleted until you
> run nodetool cleanup.
We did that after adding nodes to the cluster. And then, the cluster
wasn't balanced either.
Also, does the "Load" really account for "dead" data, or is it just live data?
On 18.01.2012, at 02:19, Maki Watanabe wrote:
> Is there any significant difference in the number of sstables on each node?
No, no significant difference there. Actually, node 8 is among those with more
sstables but with the least load (20GB).
On 17.01.2012, at 20:14, Jeremiah Jordan wrote:
> Are you deleting data or using TTL's?
If you have performed any token moves, the data will not be deleted until you
run nodetool cleanup.
To get a baseline I would run nodetool compact to do a major compaction and
purge any tombstones, as others have said.
Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
Is there any significant difference in the number of sstables on each node?
2012/1/18 Marcel Steinbach :
> We are running regular repairs, so I don't think that's the problem.
> And the data dir sizes match approx. the load from nodetool.
> Thanks for the advice, though.
>
> Our keys are digits only, and all contain a few zeros at the same offsets.
Are you deleting data or using TTL's? Expired/deleted data won't go
away until the sstable holding it is compacted. So if compaction has
happened on some nodes, but not on others, you will see this. The
disparity is pretty big (400GB to 20GB), so this probably isn't the issue.
We are running regular repairs, so I don't think that's the problem.
And the data dir sizes match approx. the load from nodetool.
Thanks for the advice, though.
Our keys are digits only, and all contain a few zeros at the same offsets. I'm
not that familiar with the md5 algorithm, but I doubt that would cause such an
imbalance.
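For what it's worth, that doubt is justified: MD5 spreads even very regular,
digit-only keys across the whole token range. A quick check in Python, using
made-up keys with zeros at fixed offsets (the modulo is just a stand-in for
RandomPartitioner's exact token construction):

import hashlib

RING = 2**127

def token(key: str) -> int:
    """Stand-in for a RandomPartitioner-style token: MD5, reduced into [0, RING)."""
    return int.from_bytes(hashlib.md5(key.encode()).digest(), "big") % RING

keys = [f"10{n:03d}007000" for n in range(5)]   # digit-only, zeros at the same offsets
for k in keys:
    print(k, f"{token(k) / RING:.3f}")          # ring positions jump around, no clustering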
Have you tried running repair first on each node? Also, verify using
df -h on the data dirs.
On Tue, Jan 17, 2012 at 7:34 AM, Marcel Steinbach wrote:
> Hi,
>
> we're using RP and have each node assigned the same amount of the token
> space. The cluster looks like this:
>
> Address  Status  ...