[jira] [Commented] (CASSANDRA-12777) Optimize the vnode allocation for single replica per DC

2016-10-25 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15605687#comment-15605687
 ] 

Aleksey Yeschenko commented on CASSANDRA-12777:
---

Committed to 3.X as 
[e2a0d75b024463ad481333bdae826928b55ac589|https://github.com/apache/cassandra/commit/e2a0d75b024463ad481333bdae826928b55ac589]
 and merged with trunk, thanks.

> Optimize the vnode allocation for single replica per DC
> ---
>
> Key: CASSANDRA-12777
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12777
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Dikang Gu
>Assignee: Dikang Gu
> Fix For: 3.10
>
>
> The new vnode allocation algorithm introduced in CASSANDRA-7032 is optimized 
> for the situation that there are multiple replicas per DC.
> In our production environment, most cluster only has one replica, in this 
> case, the algorithm does not work perfectly. It always tries to split token 
> ranges by half, so that the ownership of "min" node could go as low as ~60% 
> compared to avg.
> So for single replica case, I'm working on a new algorithm, which is based on 
> Branimir's previous commit, to split token ranges by "some" percentage, 
> instead of always by half. In this way, we can get a very small variation of 
> the ownership among different nodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12777) Optimize the vnode allocation for single replica per DC

2016-10-23 Thread Dikang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15600833#comment-15600833
 ] 

Dikang Gu commented on CASSANDRA-12777:
---

rebase on trunk, 
https://github.com/DikangGu/cassandra/commit/65b25a0f0c1511a9a7ab0204dd261af7e414ced9

> Optimize the vnode allocation for single replica per DC
> ---
>
> Key: CASSANDRA-12777
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12777
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Dikang Gu
>Assignee: Dikang Gu
> Fix For: 3.x
>
>
> The new vnode allocation algorithm introduced in CASSANDRA-7032 is optimized 
> for the situation that there are multiple replicas per DC.
> In our production environment, most cluster only has one replica, in this 
> case, the algorithm does not work perfectly. It always tries to split token 
> ranges by half, so that the ownership of "min" node could go as low as ~60% 
> compared to avg.
> So for single replica case, I'm working on a new algorithm, which is based on 
> Branimir's previous commit, to split token ranges by "some" percentage, 
> instead of always by half. In this way, we can get a very small variation of 
> the ownership among different nodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12777) Optimize the vnode allocation for single replica per DC

2016-10-20 Thread Dikang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592700#comment-15592700
 ] 

Dikang Gu commented on CASSANDRA-12777:
---

[unittest|http://cassci.datastax.com/job/blambov-DikangGu-12777-testall/]

> Optimize the vnode allocation for single replica per DC
> ---
>
> Key: CASSANDRA-12777
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12777
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Dikang Gu
>Assignee: Dikang Gu
> Fix For: 3.x
>
>
> The new vnode allocation algorithm introduced in CASSANDRA-7032 is optimized 
> for the situation that there are multiple replicas per DC.
> In our production environment, most cluster only has one replica, in this 
> case, the algorithm does not work perfectly. It always tries to split token 
> ranges by half, so that the ownership of "min" node could go as low as ~60% 
> compared to avg.
> So for single replica case, I'm working on a new algorithm, which is based on 
> Branimir's previous commit, to split token ranges by "some" percentage, 
> instead of always by half. In this way, we can get a very small variation of 
> the ownership among different nodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12777) Optimize the vnode allocation for single replica per DC

2016-10-20 Thread Dikang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592202#comment-15592202
 ] 

Dikang Gu commented on CASSANDRA-12777:
---

Cool, revert the change to the test, and rebase on trunk, 
https://github.com/DikangGu/cassandra/commit/fbadd16cf9fb15db2d8afeb261a67313fd25

Thanks!

> Optimize the vnode allocation for single replica per DC
> ---
>
> Key: CASSANDRA-12777
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12777
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Dikang Gu
>Assignee: Dikang Gu
> Fix For: 3.x
>
>
> The new vnode allocation algorithm introduced in CASSANDRA-7032 is optimized 
> for the situation that there are multiple replicas per DC.
> In our production environment, most cluster only has one replica, in this 
> case, the algorithm does not work perfectly. It always tries to split token 
> ranges by half, so that the ownership of "min" node could go as low as ~60% 
> compared to avg.
> So for single replica case, I'm working on a new algorithm, which is based on 
> Branimir's previous commit, to split token ranges by "some" percentage, 
> instead of always by half. In this way, we can get a very small variation of 
> the ownership among different nodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12777) Optimize the vnode allocation for single replica per DC

2016-10-20 Thread Branimir Lambov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15591131#comment-15591131
 ] 

Branimir Lambov commented on CASSANDRA-12777:
-

I'm sorry, the flake appears to be caused by CASSANDRA-12812. Could you please 
roll back the change to {{RandomReplicationAwareTokenAllocatorTest.java}}?

The patch is otherwise ready to commit.

> Optimize the vnode allocation for single replica per DC
> ---
>
> Key: CASSANDRA-12777
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12777
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Dikang Gu
>Assignee: Dikang Gu
> Fix For: 3.x
>
>
> The new vnode allocation algorithm introduced in CASSANDRA-7032 is optimized 
> for the situation that there are multiple replicas per DC.
> In our production environment, most cluster only has one replica, in this 
> case, the algorithm does not work perfectly. It always tries to split token 
> ranges by half, so that the ownership of "min" node could go as low as ~60% 
> compared to avg.
> So for single replica case, I'm working on a new algorithm, which is based on 
> Branimir's previous commit, to split token ranges by "some" percentage, 
> instead of always by half. In this way, we can get a very small variation of 
> the ownership among different nodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12777) Optimize the vnode allocation for single replica per DC

2016-10-19 Thread Dikang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15589335#comment-15589335
 ] 

Dikang Gu commented on CASSANDRA-12777:
---

[~blambov] do not find anything wrong obviously, I mark the test to be flaky 
here: 
https://github.com/DikangGu/cassandra/commit/0061e245776cd44affc1adba12fc339c02acc9c9#diff-a465df405c45a540a4f50e4e89a0a38dR38

new commit, 
https://github.com/DikangGu/cassandra/commit/0061e245776cd44affc1adba12fc339c02acc9c9

> Optimize the vnode allocation for single replica per DC
> ---
>
> Key: CASSANDRA-12777
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12777
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Dikang Gu
>Assignee: Dikang Gu
> Fix For: 3.x
>
>
> The new vnode allocation algorithm introduced in CASSANDRA-7032 is optimized 
> for the situation that there are multiple replicas per DC.
> In our production environment, most cluster only has one replica, in this 
> case, the algorithm does not work perfectly. It always tries to split token 
> ranges by half, so that the ownership of "min" node could go as low as ~60% 
> compared to avg.
> So for single replica case, I'm working on a new algorithm, which is based on 
> Branimir's previous commit, to split token ranges by "some" percentage, 
> instead of always by half. In this way, we can get a very small variation of 
> the ownership among different nodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12777) Optimize the vnode allocation for single replica per DC

2016-10-19 Thread Dikang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15589088#comment-15589088
 ] 

Dikang Gu commented on CASSANDRA-12777:
---

[~blambov], hmm, I will take a look.

> Optimize the vnode allocation for single replica per DC
> ---
>
> Key: CASSANDRA-12777
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12777
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Dikang Gu
>Assignee: Dikang Gu
> Fix For: 3.x
>
>
> The new vnode allocation algorithm introduced in CASSANDRA-7032 is optimized 
> for the situation that there are multiple replicas per DC.
> In our production environment, most cluster only has one replica, in this 
> case, the algorithm does not work perfectly. It always tries to split token 
> ranges by half, so that the ownership of "min" node could go as low as ~60% 
> compared to avg.
> So for single replica case, I'm working on a new algorithm, which is based on 
> Branimir's previous commit, to split token ranges by "some" percentage, 
> instead of always by half. In this way, we can get a very small variation of 
> the ownership among different nodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12777) Optimize the vnode allocation for single replica per DC

2016-10-19 Thread Branimir Lambov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15589002#comment-15589002
 ] 

Branimir Lambov commented on CASSANDRA-12777:
-

Test results: 
[utest|http://cassci.datastax.com/job/blambov-DikangGu-12777-testall/] 
[dtest|http://cassci.datastax.com/job/blambov-DikangGu-12777-dtest/]

There's a {{RandomReplicationAwareTokenAllocatorTest}} failure that doesn't 
look right.

> Optimize the vnode allocation for single replica per DC
> ---
>
> Key: CASSANDRA-12777
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12777
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Dikang Gu
>Assignee: Dikang Gu
> Fix For: 3.x
>
>
> The new vnode allocation algorithm introduced in CASSANDRA-7032 is optimized 
> for the situation that there are multiple replicas per DC.
> In our production environment, most cluster only has one replica, in this 
> case, the algorithm does not work perfectly. It always tries to split token 
> ranges by half, so that the ownership of "min" node could go as low as ~60% 
> compared to avg.
> So for single replica case, I'm working on a new algorithm, which is based on 
> Branimir's previous commit, to split token ranges by "some" percentage, 
> instead of always by half. In this way, we can get a very small variation of 
> the ownership among different nodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12777) Optimize the vnode allocation for single replica per DC

2016-10-18 Thread Branimir Lambov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584999#comment-15584999
 ] 

Branimir Lambov commented on CASSANDRA-12777:
-

Thanks for taking the time to do this, 90% sounds great to me.

+1

> Optimize the vnode allocation for single replica per DC
> ---
>
> Key: CASSANDRA-12777
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12777
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Dikang Gu
>Assignee: Dikang Gu
> Fix For: 3.x
>
>
> The new vnode allocation algorithm introduced in CASSANDRA-7032 is optimized 
> for the situation that there are multiple replicas per DC.
> In our production environment, most cluster only has one replica, in this 
> case, the algorithm does not work perfectly. It always tries to split token 
> ranges by half, so that the ownership of "min" node could go as low as ~60% 
> compared to avg.
> So for single replica case, I'm working on a new algorithm, which is based on 
> Branimir's previous commit, to split token ranges by "some" percentage, 
> instead of always by half. In this way, we can get a very small variation of 
> the ownership among different nodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12777) Optimize the vnode allocation for single replica per DC

2016-10-17 Thread Dikang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583301#comment-15583301
 ] 

Dikang Gu commented on CASSANDRA-12777:
---

Sure, I add the limit of the range of take over ratio, for both MIN and MAX 
ratios. Here is the updated patch, 
https://github.com/DikangGu/cassandra/commit/5e837747974b5faa9833dc55ac5bd33a8c5e8b31,
 and the simulation results are here, 
https://gist.github.com/DikangGu/29a6b5ab876ff6979de45118b855622b. I'd like to 
go with 0.90, since it produces better results.

Thanks.

> Optimize the vnode allocation for single replica per DC
> ---
>
> Key: CASSANDRA-12777
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12777
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Dikang Gu
>Assignee: Dikang Gu
> Fix For: 3.x
>
>
> The new vnode allocation algorithm introduced in CASSANDRA-7032 is optimized 
> for the situation that there are multiple replicas per DC.
> In our production environment, most cluster only has one replica, in this 
> case, the algorithm does not work perfectly. It always tries to split token 
> ranges by half, so that the ownership of "min" node could go as low as ~60% 
> compared to avg.
> So for single replica case, I'm working on a new algorithm, which is based on 
> Branimir's previous commit, to split token ranges by "some" percentage, 
> instead of always by half. In this way, we can get a very small variation of 
> the ownership among different nodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12777) Optimize the vnode allocation for single replica per DC

2016-10-17 Thread Branimir Lambov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15581880#comment-15581880
 ] 

Branimir Lambov commented on CASSANDRA-12777:
-

Thank you for the changes, this looks good to me.

One last thing is bothering me about the takeover situation (and sorry I did 
not get around to writing about it earlier): in greater-than we set to 0.75, 
but [on equal (or close to equal) we let the new take completely 
over|https://github.com/DikangGu/cassandra/commit/402050e32732e67055935689951a56f92b9be281#diff-3ee50470ce492c51246b32b35fae5cfcR198].
 This feels inconsistent and (at least to me) potentially problematic (it may 
also be the reason why we see no effect on the simulation). Could we limit the 
fraction of the range that can be taken over instead?

> Optimize the vnode allocation for single replica per DC
> ---
>
> Key: CASSANDRA-12777
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12777
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Dikang Gu
>Assignee: Dikang Gu
> Fix For: 3.x
>
>
> The new vnode allocation algorithm introduced in CASSANDRA-7032 is optimized 
> for the situation that there are multiple replicas per DC.
> In our production environment, most cluster only has one replica, in this 
> case, the algorithm does not work perfectly. It always tries to split token 
> ranges by half, so that the ownership of "min" node could go as low as ~60% 
> compared to avg.
> So for single replica case, I'm working on a new algorithm, which is based on 
> Branimir's previous commit, to split token ranges by "some" percentage, 
> instead of always by half. In this way, we can get a very small variation of 
> the ownership among different nodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12777) Optimize the vnode allocation for single replica per DC

2016-10-14 Thread Dikang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15577330#comment-15577330
 ] 

Dikang Gu commented on CASSANDRA-12777:
---

[~blambov] thanks a lot for the review! I addressed your comments and here is a 
new commit: 
https://github.com/DikangGu/cassandra/commit/402050e32732e67055935689951a56f92b9be281

1. wraparound calculation, for BigIntegerToken, if it's above maximum, I will 
`mod` the max token. I also add test and validation for it.
2. I put the createTokenInfo in constructor because I need to populate the unit 
info according to the tokens.
3. agree, removed the createTokenInfo
4. I tried different fractions, from 0.50 - 0.99, 0.50 will fail the assertion, 
otherwise, they do not make much difference. 
https://gist.github.com/DikangGu/acd8f568f67b11082443419a8d503b01, I put 0.75 
in this commit.
5. Add the `TokenAllocatorBase`, I keep the factory class because I think it's 
cleaner to keep the factory method there.

Thanks!

> Optimize the vnode allocation for single replica per DC
> ---
>
> Key: CASSANDRA-12777
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12777
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Dikang Gu
>Assignee: Dikang Gu
> Fix For: 3.x
>
>
> The new vnode allocation algorithm introduced in CASSANDRA-7032 is optimized 
> for the situation that there are multiple replicas per DC.
> In our production environment, most cluster only has one replica, in this 
> case, the algorithm does not work perfectly. It always tries to split token 
> ranges by half, so that the ownership of "min" node could go as low as ~60% 
> compared to avg.
> So for single replica case, I'm working on a new algorithm, which is based on 
> Branimir's previous commit, to split token ranges by "some" percentage, 
> instead of always by half. In this way, we can get a very small variation of 
> the ownership among different nodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12777) Optimize the vnode allocation for single replica per DC

2016-10-14 Thread Branimir Lambov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15574714#comment-15574714
 ] 

Branimir Lambov commented on CASSANDRA-12777:
-

"Percent" implies 0-100 which isn't what you are using (rightly so). "Ratio" is 
a better term for 0-1 multiplier.

How do you handle result above maximum in [the wraparound 
calculation|https://github.com/DikangGu/cassandra/commit/3711f9c00f47aba768ceba70fd9bd54f1d64#diff-fbd2bf80d0bb22ddfaac9973e58558f4R100]?
 For {{LongToken}} that's not a problem as the token gets wrapped on conversion 
to long, but I don't think that happens for {{BigIntegerToken}}. This needs a 
test as well (including a random one using {{a.size(split(a, b, x))}} within a 
couple of ulps from {{x * a.size(b)}}, also including a validity check for the 
returned token).

[{{createTokenInfo}} in 
constructor|https://github.com/DikangGu/cassandra/commit/3711f9c00f47aba768ceba70fd9bd54f1d64#diff-3ee50470ce492c51246b32b35fae5cfcR51]
 appears superfluous.

Most of the comments in the original implementation add important information 
and should be preserved (e.g. why [the + 
2|https://github.com/DikangGu/cassandra/commit/3711f9c00f47aba768ceba70fd9bd54f1d64#diff-3ee50470ce492c51246b32b35fae5cfcR155]).

Did you try lower fractions than 0.99 for takeovers? I would go lower, perhaps 
0.9 or even 0.75 (try the simulation out).

Nit: From a design perspective I believe it would be cleaner to leave the 
{{TokenAllocator}} interface as interface, put the factory method there, and 
move the abstract base class to a {{TokenAllocatorBase}}.

> Optimize the vnode allocation for single replica per DC
> ---
>
> Key: CASSANDRA-12777
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12777
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Dikang Gu
>Assignee: Dikang Gu
> Fix For: 3.x
>
>
> The new vnode allocation algorithm introduced in CASSANDRA-7032 is optimized 
> for the situation that there are multiple replicas per DC.
> In our production environment, most cluster only has one replica, in this 
> case, the algorithm does not work perfectly. It always tries to split token 
> ranges by half, so that the ownership of "min" node could go as low as ~60% 
> compared to avg.
> So for single replica case, I'm working on a new algorithm, which is based on 
> Branimir's previous commit, to split token ranges by "some" percentage, 
> instead of always by half. In this way, we can get a very small variation of 
> the ownership among different nodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12777) Optimize the vnode allocation for single replica per DC

2016-10-12 Thread Dikang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567784#comment-15567784
 ] 

Dikang Gu commented on CASSANDRA-12777:
---

I have a draft patch, and there is the sample results:
{code}
4 vnode, 250 nodes, max 1.11 min 0.89 stddev 0.0734
16 vnode, 250 nodes, max 1.04 min 0.97 stddev 0.0179
64 vnode, 250 nodes, max 1.01 min 0.99 stddev 0.0044
{code}

Will clean it a bit and send it out for review.

> Optimize the vnode allocation for single replica per DC
> ---
>
> Key: CASSANDRA-12777
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12777
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Dikang Gu
>Assignee: Dikang Gu
> Fix For: 3.x
>
>
> The new vnode allocation algorithm introduced in CASSANDRA-7032 is optimized 
> for the situation that there are multiple replicas per DC.
> In our production environment, most cluster only has one replica, in this 
> case, the algorithm does work perfectly. It always tries to split token 
> ranges by half, so that the ownership of "min" node could go as low as ~60% 
> compared to avg.
> So for single replica case, I'm working on a new algorithm, which is based on 
> Branimir's previous commit, to split token ranges by "some" percentage, 
> instead of always by half. In this way, we can get a very small variation of 
> the ownership among different nodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)