[
https://issues.apache.org/jira/browse/CASSANDRA-12777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15574714#comment-15574714
]
Branimir Lambov commented on CASSANDRA-12777:
---------------------------------------------
"Percent" implies 0-100, which isn't what you are using (rightly so). "Ratio" is
a better term for a 0-1 multiplier.
How do you handle a result above the maximum in [the wraparound
calculation|https://github.com/DikangGu/cassandra/commit/3711f9c00f47ddddaba768ceba70fd9bd54f1d64#diff-fbd2bf80d0bb22ddfaac9973e58558f4R100]?
For {{LongToken}} that's not a problem, as the token gets wrapped on conversion
to long, but I don't think that happens for {{BigIntegerToken}}. This needs a
test as well (including a random one checking that {{a.size(split(a, b, x))}} is
within a couple of ulps of {{x * a.size(b)}}, plus a validity check for the
returned token).
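To illustrate the concern, here is a minimal, self-contained sketch rather than the actual partitioner code. It assumes a hypothetical ring of {{2^127}} tokens held as {{BigInteger}}; the class {{RingSplitSketch}} and its {{RING}}, {{width}}, {{size}} and {{split}} members are made up for this example. The point is the explicit {{mod}} after adding the offset, which is the step a {{long}} gets for free through overflow.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical stand-in for a BigInteger-based token ring; not Cassandra code.
final class RingSplitSketch {
    // Assumed ring size: valid tokens satisfy 0 <= t < RING.
    static final BigInteger RING = BigInteger.ONE.shiftLeft(127);

    // Clockwise distance from left to right; equal tokens are treated as the full ring.
    static BigInteger width(BigInteger left, BigInteger right) {
        BigInteger w = right.subtract(left).mod(RING); // mod never returns a negative value
        return w.signum() == 0 ? RING : w;
    }

    // Fraction of the ring covered by (left, right].
    static double size(BigInteger left, BigInteger right) {
        return width(left, right).doubleValue() / RING.doubleValue();
    }

    // Token located at `ratio` of the way from left to right, wrapped back into the ring.
    static BigInteger split(BigInteger left, BigInteger right, double ratio) {
        if (!(ratio >= 0.0 && ratio <= 1.0))
            throw new IllegalArgumentException("ratio must be in [0, 1]: " + ratio);
        BigInteger offset = new BigDecimal(width(left, right))
                .multiply(new BigDecimal(ratio)) // exact product, no double rounding
                .toBigInteger();                 // truncates towards zero
        // Without this mod, left + offset can exceed the maximum token.
        return left.add(offset).mod(RING);
    }

    public static void main(String[] args) {
        BigInteger left = RING.subtract(BigInteger.TEN); // 10 below the top of the ring
        BigInteger right = BigInteger.TEN;               // range wraps past the maximum
        System.out.println(split(left, right, 0.75));    // prints 5
    }
}
```

A random test along the lines suggested above would draw {{a}}, {{b}} and {{x}}, check that {{split(a, b, x)}} lies in {{[0, RING)}}, and compare {{size(a, split(a, b, x))}} with {{x * size(a, b)}} to within a few ulps.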
[{{createTokenInfo}} in the
constructor|https://github.com/DikangGu/cassandra/commit/3711f9c00f47ddddaba768ceba70fd9bd54f1d64#diff-3ee50470ce492c51246b32b35fae5cfcR51]
appears superfluous.
Most of the comments in the original implementation add important information
and should be preserved (e.g. why [the +
2|https://github.com/DikangGu/cassandra/commit/3711f9c00f47ddddaba768ceba70fd9bd54f1d64#diff-3ee50470ce492c51246b32b35fae5cfcR155]).
Did you try lower fractions than 0.99 for takeovers? I would go lower, perhaps
0.9 or even 0.75 (try the simulation out).
Nit: From a design perspective I believe it would be cleaner to leave the
{{TokenAllocator}} interface as an interface, put the factory method there, and
move the abstract base class to a {{TokenAllocatorBase}}.
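Roughly the shape I have in mind, as a simplified sketch rather than the real code: only the names {{TokenAllocator}} and {{TokenAllocatorBase}} come from the suggestion above. The {{create}} factory, the {{pickTokens}} hook, the two concrete allocators, the {{Long}} stand-in for tokens and the placeholder token values are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Stays a plain interface and hosts the factory method (static interface methods need Java 8+).
interface TokenAllocator<Unit> {
    // Assumed, simplified method shape; Long stands in for a real token type.
    Collection<Long> addUnit(Unit newUnit, int numTokens);

    // Hypothetical factory: picks an implementation based on the replica count.
    static <U> TokenAllocator<U> create(int replicas) {
        if (replicas == 1)
            return new SingleReplicaAllocator<U>();
        return new MultiReplicaAllocator<U>();
    }
}

// Shared state and bookkeeping move here instead of living in TokenAllocator itself.
abstract class TokenAllocatorBase<Unit> implements TokenAllocator<Unit> {
    protected final List<Unit> units = new ArrayList<>();

    @Override
    public Collection<Long> addUnit(Unit newUnit, int numTokens) {
        units.add(newUnit);
        return pickTokens(newUnit, numTokens);
    }

    // Each strategy decides where the new tokens go.
    protected abstract Collection<Long> pickTokens(Unit newUnit, int numTokens);
}

// Hypothetical single-replica strategy; returns placeholder tokens only.
final class SingleReplicaAllocator<Unit> extends TokenAllocatorBase<Unit> {
    @Override
    protected Collection<Long> pickTokens(Unit newUnit, int numTokens) {
        List<Long> tokens = new ArrayList<>();
        for (int i = 0; i < numTokens; i++)
            tokens.add((long) i); // placeholder, not a real allocation
        return tokens;
    }
}

// Hypothetical multi-replica strategy; returns placeholder tokens only.
final class MultiReplicaAllocator<Unit> extends TokenAllocatorBase<Unit> {
    @Override
    protected Collection<Long> pickTokens(Unit newUnit, int numTokens) {
        List<Long> tokens = new ArrayList<>();
        for (int i = 0; i < numTokens; i++)
            tokens.add(-(long) i); // placeholder, not a real allocation
        return tokens;
    }
}
```

Callers would then depend only on {{TokenAllocator}}, e.g. {{TokenAllocator.<String>create(1).addUnit("node1", 4)}}, and never on the base class.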
> Optimize the vnode allocation for single replica per DC
> -------------------------------------------------------
>
> Key: CASSANDRA-12777
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12777
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Dikang Gu
> Assignee: Dikang Gu
> Fix For: 3.x
>
>
> The new vnode allocation algorithm introduced in CASSANDRA-7032 is optimized
> for the situation where there are multiple replicas per DC.
> In our production environment, most clusters have only one replica; in this
> case, the algorithm does not work perfectly. It always tries to split token
> ranges in half, so the ownership of the "min" node could go as low as ~60%
> compared to avg.
> So for the single replica case, I'm working on a new algorithm, based on
> Branimir's previous commit, that splits token ranges by "some" percentage
> instead of always in half. In this way, we can get a very small variation in
> ownership among different nodes.