[jira] [Comment Edited] (CASSANDRA-12777) Optimize the vnode allocation for single replica per DC
[ https://issues.apache.org/jira/browse/CASSANDRA-12777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592202#comment-15592202 ]

Dikang Gu edited comment on CASSANDRA-12777 at 10/24/16 3:38 AM:
---

Cool, I reverted the change to the test and rebased on trunk. Thanks!

was (Author: dikanggu):
Cool, revert the change to the test, and rebase on trunk, https://github.com/DikangGu/cassandra/commit/fbadd16cf9fb15db2d8afeb261a67313fd25 Thanks!

> Optimize the vnode allocation for single replica per DC
> ---
>
>                 Key: CASSANDRA-12777
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12777
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Dikang Gu
>            Assignee: Dikang Gu
>             Fix For: 3.x
>
>
> The new vnode allocation algorithm introduced in CASSANDRA-7032 is optimized
> for the case where there are multiple replicas per DC.
> In our production environment, most clusters have only one replica. In this
> case, the algorithm does not work well: it always tries to split token
> ranges in half, so the ownership of the "min" node can go as low as ~60%
> of the average.
> So for the single-replica case, I'm working on a new algorithm, based on
> Branimir's previous commit, that splits token ranges by "some" percentage
> instead of always in half. This way, we can get a very small variation in
> ownership among different nodes.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
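The issue description quoted above contrasts splitting a token range in half with splitting it by "some" percentage. As a rough illustration only (this is not the code from the patch, and the class and method names are hypothetical), the sketch below computes where a new token would land inside a range for a given ratio. It relies on the fact that a token owns the range from the previous token (exclusive) up to itself (inclusive), so a new token placed at `left + ratio * width` takes over `ratio` of the original range. Wraparound is deliberately left out.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

/** Toy illustration of a ratio-based range split; names are hypothetical. */
class RangeSplitSketch {

    /**
     * Returns a token t in the range (left, right] so that the new token's
     * range (left, t] covers roughly `ratio` of the original width.
     * A ratio of 0.5 reproduces the "always split in half" behaviour.
     * Assumes left < right, i.e. the range does not wrap around the ring.
     */
    static BigInteger splitToken(BigInteger left, BigInteger right, double ratio) {
        if (left.compareTo(right) >= 0) {
            throw new IllegalArgumentException("sketch expects left < right");
        }
        if (ratio <= 0.0 || ratio >= 1.0) {
            throw new IllegalArgumentException("ratio must be in (0, 1)");
        }
        BigInteger width = right.subtract(left);
        // Truncates toward zero; for very narrow ranges the offset can be 0,
        // which a real allocator would have to handle separately.
        BigInteger offset = new BigDecimal(width)
                .multiply(BigDecimal.valueOf(ratio))
                .toBigInteger();
        return left.add(offset);
    }
}
```

For example, with a range of width 1000 starting at 0, a ratio of 0.5 places the new token at the midpoint, while a ratio of 0.25 hands the new token only a quarter of the range and leaves three quarters with the existing owner.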
[jira] [Comment Edited] (CASSANDRA-12777) Optimize the vnode allocation for single replica per DC
[ https://issues.apache.org/jira/browse/CASSANDRA-12777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15589335#comment-15589335 ]

Dikang Gu edited comment on CASSANDRA-12777 at 10/20/16 3:59 PM:
---

[~blambov] I did not find anything obviously wrong, so I marked the test as flaky here: https://github.com/DikangGu/cassandra/commit/0061e245776cd44affc1adba12fc339c02acc9c9#diff-a465df405c45a540a4f50e4e89a0a38dR38

was (Author: dikanggu):
[~blambov] do not find anything wrong obviously, I mark the test to be flaky here: https://github.com/DikangGu/cassandra/commit/0061e245776cd44affc1adba12fc339c02acc9c9#diff-a465df405c45a540a4f50e4e89a0a38dR38 new commit, https://github.com/DikangGu/cassandra/commit/0061e245776cd44affc1adba12fc339c02acc9c9
[jira] [Comment Edited] (CASSANDRA-12777) Optimize the vnode allocation for single replica per DC
[ https://issues.apache.org/jira/browse/CASSANDRA-12777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586059#comment-15586059 ]

Dikang Gu edited comment on CASSANDRA-12777 at 10/19/16 5:34 PM:
---

Rebased on trunk.

was (Author: dikanggu):
rebased on trunk, https://github.com/DikangGu/cassandra/commit/7e5f664cc9145207a8e0ed348a2710b1a409f3ae
[jira] [Comment Edited] (CASSANDRA-12777) Optimize the vnode allocation for single replica per DC
[ https://issues.apache.org/jira/browse/CASSANDRA-12777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583301#comment-15583301 ]

Dikang Gu edited comment on CASSANDRA-12777 at 10/18/16 5:28 PM:
---

Sure, I added limits on the range of the take-over ratio, for both the MIN and MAX ratios. I updated the patch, and the simulation results are here: https://gist.github.com/DikangGu/29a6b5ab876ff6979de45118b855622b. I'd like to go with 0.90, since it produces better results. Thanks.

was (Author: dikanggu):
Sure, I add the limit of the range of take over ratio, for both MIN and MAX ratios. Here is the updated patch, https://github.com/DikangGu/cassandra/commit/5e837747974b5faa9833dc55ac5bd33a8c5e8b31, and the simulation results are here, https://gist.github.com/DikangGu/29a6b5ab876ff6979de45118b855622b. I'd like to go with 0.90, since it produces better results. Thanks.
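The limit on the take-over ratio mentioned in the comment above can be pictured as a simple clamp. The sketch below is only an illustration under stated assumptions: it treats the 0.90 figure as the upper bound and derives the lower bound as `1 - max`, and neither that reading nor the class and method names are taken from the patch.

```java
/** Hypothetical clamp for a take-over ratio; not the code from the patch. */
class TakeoverRatioSketch {

    /**
     * Keeps `ratio` within [1 - maxRatio, maxRatio].
     * Assumption: the lower bound mirrors the upper bound, so a maxRatio
     * of 0.90 yields an allowed range of about 0.10 to 0.90.
     */
    static double clamp(double ratio, double maxRatio) {
        if (maxRatio < 0.5 || maxRatio >= 1.0) {
            throw new IllegalArgumentException("maxRatio must be in [0.5, 1)");
        }
        double minRatio = 1.0 - maxRatio;
        return Math.min(maxRatio, Math.max(minRatio, ratio));
    }
}
```

With this reading, a computed ratio of 0.97 would be capped at 0.90, a ratio of 0.02 would be raised to about 0.10, and anything in between passes through unchanged.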
[jira] [Comment Edited] (CASSANDRA-12777) Optimize the vnode allocation for single replica per DC
[ https://issues.apache.org/jira/browse/CASSANDRA-12777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15577330#comment-15577330 ]

Dikang Gu edited comment on CASSANDRA-12777 at 10/17/16 8:10 PM:
---

[~blambov] thanks a lot for the review! I addressed your comments in a new commit.

1. Wraparound calculation: for BigIntegerToken, if the value is above the maximum, I `mod` it by the max token. I also added a test and validation for it.
2. I put createTokenInfo in the constructor because I need to populate the unit info according to the tokens.
3. Agreed, removed the createTokenInfo.
4. I tried different fractions from 0.50 to 0.99. 0.50 fails the assertion; otherwise, they do not make much difference (https://gist.github.com/DikangGu/acd8f568f67b11082443419a8d503b01). I put 0.75 in this commit.
5. Added `TokenAllocatorBase`. I kept the factory class because I think it's cleaner to keep the factory method there.

Thanks!

was (Author: dikanggu):
[~blambov] thanks a lot for the review! I addressed your comments and here is a new commit: https://github.com/DikangGu/cassandra/commit/402050e32732e67055935689951a56f92b9be281

1. wraparound calculation, for BigIntegerToken, if it's above maximum, I will `mod` the max token. I also add test and validation for it.
2. I put the createTokenInfo in constructor because I need to populate the unit info according to the tokens.
3. agree, removed the createTokenInfo
4. I tried different fractions, from 0.50 - 0.99, 0.50 will fail the assertion, otherwise, they do not make much difference. https://gist.github.com/DikangGu/acd8f568f67b11082443419a8d503b01, I put 0.75 in this commit.
5. Add the `TokenAllocatorBase`, I keep the factory class because I think it's cleaner to keep the factory method there.

Thanks!
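Item 1 of the comment above describes wrapping a BigIntegerToken value back into range with `mod` when it exceeds the maximum. A minimal sketch of that idea follows. The class and method names are hypothetical, and the maximum is passed in as a parameter; the 2^127 value used in the usage note is, to the best of my knowledge, RandomPartitioner's upper bound, but treat it as an assumption here.

```java
import java.math.BigInteger;

/** Hypothetical wraparound helper; not the code from the patch. */
class WrapAroundSketch {

    /**
     * Returns `token` unchanged when it is within [0, max]; otherwise
     * reduces it modulo `max` so it lands back on the ring.
     * BigInteger.mod always returns a non-negative result for a positive modulus.
     */
    static BigInteger wrap(BigInteger token, BigInteger max) {
        if (max.signum() <= 0) {
            throw new IllegalArgumentException("max must be positive");
        }
        return token.compareTo(max) > 0 ? token.mod(max) : token;
    }
}
```

For instance, with `max = BigInteger.valueOf(2).pow(127)`, a computed token of `max + 5` wraps to 5, while any token at or below `max` is returned as is.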
[jira] [Comment Edited] (CASSANDRA-12777) Optimize the vnode allocation for single replica per DC
[ https://issues.apache.org/jira/browse/CASSANDRA-12777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15572934#comment-15572934 ]

Dikang Gu edited comment on CASSANDRA-12777 at 10/15/16 4:21 AM:
---

It's not perfect yet, but I'd like to send it out so people can take a look first. Currently it supports Murmur3Partitioner and RandomPartitioner. [~blambov], would you mind taking a look? I will continue to clean it up. Thanks!

was (Author: dikanggu):
It's not perfect yet, but I'd like to send it out to let people take a look first: https://github.com/DikangGu/cassandra/commit/3711f9c00f47aba768ceba70fd9bd54f1d64 Currently it supports Murmur3Partitioner and RandomPartitioner. [~blambov] do you mind to take a look? I will continue to clean it up. Thanks!
[jira] [Comment Edited] (CASSANDRA-12777) Optimize the vnode allocation for single replica per DC
[ https://issues.apache.org/jira/browse/CASSANDRA-12777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15572934#comment-15572934 ]

Dikang Gu edited comment on CASSANDRA-12777 at 10/13/16 9:51 PM:
---

It's not perfect yet, but I'd like to send it out so people can take a look first: https://github.com/DikangGu/cassandra/commit/3711f9c00f47aba768ceba70fd9bd54f1d64 Currently it supports Murmur3Partitioner and RandomPartitioner. [~blambov], would you mind taking a look? I will continue to clean it up. Thanks!

was (Author: dikanggu):
It's not perfect yet, but I'd like to send it out to let people take a look first: https://github.com/DikangGu/cassandra/commit/404e7238dfe6c5147e9681093572aad4e6aa779d Currently it supports Murmur3Partitioner and RandomPartitioner. [~blambov] do you mind to take a look? I will continue to clean it up. Thanks!