[jira] [Comment Edited] (CASSANDRA-12777) Optimize the vnode allocation for single replica per DC

2016-10-23 Thread Dikang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592202#comment-15592202
 ] 

Dikang Gu edited comment on CASSANDRA-12777 at 10/24/16 3:38 AM:
-

Cool, I reverted the change to the test and rebased on trunk.

Thanks!


was (Author: dikanggu):
Cool, I reverted the change to the test and rebased on trunk: 
https://github.com/DikangGu/cassandra/commit/fbadd16cf9fb15db2d8afeb261a67313fd25

Thanks!

> Optimize the vnode allocation for single replica per DC
> ---
>
> Key: CASSANDRA-12777
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12777
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Dikang Gu
> Assignee: Dikang Gu
> Fix For: 3.x
>
>
> The new vnode allocation algorithm introduced in CASSANDRA-7032 is optimized 
> for the case where there are multiple replicas per DC.
> In our production environment, most clusters have only one replica per DC, 
> and in this case the algorithm does not work well. It always tries to split 
> token ranges in half, so the ownership of the "min" node can go as low as 
> ~60% of the average.
> So for the single-replica case, I'm working on a new algorithm, based on 
> Branimir's previous commit, that splits token ranges by "some" percentage 
> instead of always in half. This way, we can get a very small variation in 
> ownership among the nodes.
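
As a rough illustration of the idea in the description (not the patch itself), the sketch below compares splitting a token range in half with splitting it by a chosen fraction. The names `split_token` and `ownership_after_split`, the toy ring size, and the convention that a token owns the range ending at it are assumptions made for this example.

```python
RING_SIZE = 10_000  # hypothetical toy ring size, chosen to keep the numbers readable


def split_token(prev_token, cur_token, fraction, ring_size=RING_SIZE):
    """Return a new token inside the range (prev_token, cur_token].

    Assumes a token owns the range that ends at it, so the new token takes
    over the lower `fraction` of the range and the current owner keeps the rest.
    """
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must be strictly between 0 and 1")
    # The modulo handles ranges that wrap around the end of the ring.
    width = (cur_token - prev_token) % ring_size
    return (prev_token + int(width * fraction)) % ring_size


def ownership_after_split(width, fraction):
    """Return (share taken by the new token, share kept by the current owner)."""
    taken = int(width * fraction)
    return taken, width - taken


if __name__ == "__main__":
    # Half split versus a fractional split of the same range.
    print(ownership_after_split(1600, 0.5))
    print(ownership_after_split(1600, 0.375))
```

For example, with a range of width 1600 and a hypothetical per-token target of 1000, a half split leaves 800 on each side, while a 0.375 split lets the new token take 600 and leaves the current owner at 1000.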



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-12777) Optimize the vnode allocation for single replica per DC

2016-10-20 Thread Dikang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15589335#comment-15589335
 ] 

Dikang Gu edited comment on CASSANDRA-12777 at 10/20/16 3:59 PM:
-

[~blambov] I did not find anything obviously wrong, so I marked the test as 
flaky here: 
https://github.com/DikangGu/cassandra/commit/0061e245776cd44affc1adba12fc339c02acc9c9#diff-a465df405c45a540a4f50e4e89a0a38dR38

New commit.


was (Author: dikanggu):
[~blambov] I did not find anything obviously wrong, so I marked the test as 
flaky here: 
https://github.com/DikangGu/cassandra/commit/0061e245776cd44affc1adba12fc339c02acc9c9#diff-a465df405c45a540a4f50e4e89a0a38dR38

New commit: 
https://github.com/DikangGu/cassandra/commit/0061e245776cd44affc1adba12fc339c02acc9c9



[jira] [Comment Edited] (CASSANDRA-12777) Optimize the vnode allocation for single replica per DC

2016-10-19 Thread Dikang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586059#comment-15586059
 ] 

Dikang Gu edited comment on CASSANDRA-12777 at 10/19/16 5:34 PM:
-

rebased on trunk.


was (Author: dikanggu):
rebased on trunk, 
https://github.com/DikangGu/cassandra/commit/7e5f664cc9145207a8e0ed348a2710b1a409f3ae



[jira] [Comment Edited] (CASSANDRA-12777) Optimize the vnode allocation for single replica per DC

2016-10-18 Thread Dikang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583301#comment-15583301
 ] 

Dikang Gu edited comment on CASSANDRA-12777 at 10/18/16 5:28 PM:
-

Sure, I added limits on the range of the takeover ratio, for both the MIN and 
MAX ratios. Here is the updated patch, and the simulation results are here: 
https://gist.github.com/DikangGu/29a6b5ab876ff6979de45118b855622b. I'd like to 
go with 0.90, since it produces better results.

Thanks.
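
One way to read "limits on the range of the takeover ratio" is as a clamp applied to the fraction used when splitting a range. The sketch below assumes that 0.90 is the upper limit and that the lower limit is its complement; neither is stated in this comment, and the names are hypothetical.

```python
MAX_TAKEOVER_RATIO = 0.90  # assumption: the 0.90 mentioned above is the upper limit
MIN_TAKEOVER_RATIO = 1.0 - MAX_TAKEOVER_RATIO  # assumption: lower limit mirrors the upper one


def clamp_takeover_ratio(ratio, lo=MIN_TAKEOVER_RATIO, hi=MAX_TAKEOVER_RATIO):
    """Keep the requested takeover ratio inside the closed interval [lo, hi]."""
    return max(lo, min(hi, ratio))


if __name__ == "__main__":
    for requested in (0.0, 0.5, 0.99):
        print(requested, clamp_takeover_ratio(requested))
```

Clamping this way would prevent a new token from taking almost all, or almost none, of an existing range.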


was (Author: dikanggu):
Sure, I added limits on the range of the takeover ratio, for both the MIN and 
MAX ratios. Here is the updated patch, 
https://github.com/DikangGu/cassandra/commit/5e837747974b5faa9833dc55ac5bd33a8c5e8b31,
 and the simulation results are here: 
https://gist.github.com/DikangGu/29a6b5ab876ff6979de45118b855622b. I'd like to 
go with 0.90, since it produces better results.

Thanks.



[jira] [Comment Edited] (CASSANDRA-12777) Optimize the vnode allocation for single replica per DC

2016-10-17 Thread Dikang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15577330#comment-15577330
 ] 

Dikang Gu edited comment on CASSANDRA-12777 at 10/17/16 8:10 PM:
-

[~blambov] thanks a lot for the review! I addressed your comments, and here is 
a new commit.

1. Wraparound calculation: for BigIntegerToken, if the value is above the 
maximum, I take it `mod` the max token. I also added a test and validation for it.
2. I put createTokenInfo in the constructor because I need to populate the unit 
info according to the tokens.
3. Agreed, removed the createTokenInfo.
4. I tried different fractions, from 0.50 to 0.99. 0.50 fails the assertion; 
otherwise, they do not make much difference: 
https://gist.github.com/DikangGu/acd8f568f67b11082443419a8d503b01. I put 0.75 
in this commit.
5. Added `TokenAllocatorBase`. I kept the factory class because I think it's 
cleaner to keep the factory method there.

Thanks!
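
The wraparound handling in point 1 can be sketched roughly as follows. This is only an illustration of taking a token `mod` the maximum; the function name and the value of `MAX_TOKEN` (2**127, as for a RandomPartitioner-style token) are assumptions, not code from the commit.

```python
MAX_TOKEN = 2**127  # assumption: upper bound of a RandomPartitioner-style BigInteger token


def advance_token(token, offset, max_token=MAX_TOKEN):
    """Move `token` forward by `offset`, wrapping past the maximum back to the start."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    candidate = token + offset
    if candidate > max_token:
        candidate %= max_token  # wrap around the ring
    return candidate


if __name__ == "__main__":
    # A token near the top of the ring wraps to a small value.
    print(advance_token(MAX_TOKEN - 5, 10))
```

Python integers are arbitrary precision, so no separate big-integer type is needed in this sketch.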


was (Author: dikanggu):
[~blambov] thanks a lot for the review! I addressed your comments, and here is 
a new commit: 
https://github.com/DikangGu/cassandra/commit/402050e32732e67055935689951a56f92b9be281

1. Wraparound calculation: for BigIntegerToken, if the value is above the 
maximum, I take it `mod` the max token. I also added a test and validation for it.
2. I put createTokenInfo in the constructor because I need to populate the unit 
info according to the tokens.
3. Agreed, removed the createTokenInfo.
4. I tried different fractions, from 0.50 to 0.99. 0.50 fails the assertion; 
otherwise, they do not make much difference: 
https://gist.github.com/DikangGu/acd8f568f67b11082443419a8d503b01. I put 0.75 
in this commit.
5. Added `TokenAllocatorBase`. I kept the factory class because I think it's 
cleaner to keep the factory method there.

Thanks!



[jira] [Comment Edited] (CASSANDRA-12777) Optimize the vnode allocation for single replica per DC

2016-10-14 Thread Dikang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15572934#comment-15572934
 ] 

Dikang Gu edited comment on CASSANDRA-12777 at 10/15/16 4:21 AM:
-

It's not perfect yet, but I'd like to send it out to let people take a look 
first.

Currently it supports Murmur3Partitioner and RandomPartitioner. [~blambov] 
would you mind taking a look?

I will continue to clean it up.
Thanks!


was (Author: dikanggu):
It's not perfect yet, but I'd like to send it out to let people take a look 
first: 
https://github.com/DikangGu/cassandra/commit/3711f9c00f47aba768ceba70fd9bd54f1d64

Currently it supports Murmur3Partitioner and RandomPartitioner. [~blambov] 
would you mind taking a look?

I will continue to clean it up.
Thanks!



[jira] [Comment Edited] (CASSANDRA-12777) Optimize the vnode allocation for single replica per DC

2016-10-13 Thread Dikang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15572934#comment-15572934
 ] 

Dikang Gu edited comment on CASSANDRA-12777 at 10/13/16 9:51 PM:
-

It's not perfect yet, but I'd like to send it out to let people take a look 
first: 
https://github.com/DikangGu/cassandra/commit/3711f9c00f47aba768ceba70fd9bd54f1d64

Currently it supports Murmur3Partitioner and RandomPartitioner. [~blambov] 
would you mind taking a look?

I will continue to clean it up.
Thanks!


was (Author: dikanggu):
It's not perfect yet, but I'd like to send it out to let people take a look 
first: 
https://github.com/DikangGu/cassandra/commit/404e7238dfe6c5147e9681093572aad4e6aa779d

Currently it supports Murmur3Partitioner and RandomPartitioner. [~blambov] 
would you mind taking a look?

I will continue to clean it up.
Thanks!
