[ https://issues.apache.org/jira/browse/CASSANDRA-6976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14199253#comment-14199253 ]
Ariel Weisberg edited comment on CASSANDRA-6976 at 11/5/14 10:38 PM:
---------------------------------------------------------------------

I made a JMH microbenchmark for StorageProxy.getRestrictedRanges(). There is a benchmark for returning the entire ring, because the bound is min token -> min token, as well as a benchmark for a range scan from token("ariel") -> token("jonathan"). There is also a patch necessary to get the JMH test to compile and run.

The nodes parameter in the test assumes that a node has 256 tokens, so the number of tokens is NODES * 256. The number of endpoints is actually 127; I need to add some more code to generate the right number of endpoint addresses.

To summarize, at 2000 * 256 tokens it takes 67 milliseconds for a global scan and 1.6 milliseconds for the range scan. This was with the ByteOrderedPartitioner, which was the default when not specifying any config.

{quote}
[java] Benchmark                                      (nodes)  Mode  Samples      Score      Error  Units
[java] o.a.c.t.m.GetRestrictedRanges.benchmarkGlobal        1  avgt       15     20.424 ±    1.063  us/op
[java] o.a.c.t.m.GetRestrictedRanges.benchmarkGlobal       10  avgt       15    149.235 ±    8.565  us/op
[java] o.a.c.t.m.GetRestrictedRanges.benchmarkGlobal       50  avgt       15    792.140 ±   66.921  us/op
[java] o.a.c.t.m.GetRestrictedRanges.benchmarkGlobal      100  avgt       15   1849.838 ±   78.383  us/op
[java] o.a.c.t.m.GetRestrictedRanges.benchmarkGlobal      500  avgt       15  14135.356 ± 1422.645  us/op
[java] o.a.c.t.m.GetRestrictedRanges.benchmarkGlobal     1000  avgt       15  29536.792 ±  916.279  us/op
[java] o.a.c.t.m.GetRestrictedRanges.benchmarkGlobal     2000  avgt       15  67073.112 ± 3871.003  us/op
[java] o.a.c.t.m.GetRestrictedRanges.benchmarkRange         1  avgt       15      5.586 ±    0.489  us/op
[java] o.a.c.t.m.GetRestrictedRanges.benchmarkRange        10  avgt       15     12.877 ±    0.528  us/op
[java] o.a.c.t.m.GetRestrictedRanges.benchmarkRange        50  avgt       15     34.626 ±    2.661  us/op
[java] o.a.c.t.m.GetRestrictedRanges.benchmarkRange       100  avgt       15     66.194 ±    7.557  us/op
[java] o.a.c.t.m.GetRestrictedRanges.benchmarkRange       500  avgt       15    323.378 ±    5.684  us/op
[java] o.a.c.t.m.GetRestrictedRanges.benchmarkRange      1000  avgt       15    721.757 ±   74.416  us/op
[java] o.a.c.t.m.GetRestrictedRanges.benchmarkRange      2000  avgt       15   1652.384 ±  149.049  us/op
{quote}

> Determining replicas to query is very slow with large numbers of nodes or vnodes
> --------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-6976
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6976
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Benedict
>            Assignee: Ariel Weisberg
>              Labels: performance
>             Fix For: 2.1.2
>
>         Attachments: GetRestrictedRanges.java, jmh_output.txt, make_jmh_work.patch
>
>
> As described in CASSANDRA-6906, this can be ~100ms for a relatively small
> cluster with vnodes, which is longer than it will spend in transit on the
> network. This should be much faster.
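For reading the benchmark table, it may help to normalize each score by the token count (NODES * 256, as stated in the comment). The sketch below is only that arithmetic applied to the reported averages; the class and method names (PerTokenCost, nanosPerToken) are hypothetical and are not part of the attached GetRestrictedRanges.java or of Cassandra.

```java
public class PerTokenCost {
    // Tokens per node, as assumed by the benchmark's nodes parameter.
    static final int TOKENS_PER_NODE = 256;

    // Convert a JMH average (us/op) into nanoseconds per token on the ring.
    static double nanosPerToken(int nodes, double microsPerOp) {
        long tokens = (long) nodes * TOKENS_PER_NODE;
        return microsPerOp * 1000.0 / tokens;
    }

    public static void main(String[] args) {
        // Scores copied from the benchmark output in the comment (us/op).
        int[] nodes = {1, 10, 50, 100, 500, 1000, 2000};
        double[] global = {20.424, 149.235, 792.140, 1849.838,
                           14135.356, 29536.792, 67073.112};
        double[] range = {5.586, 12.877, 34.626, 66.194,
                          323.378, 721.757, 1652.384};

        System.out.printf("%7s %10s %16s %16s%n",
                "nodes", "tokens", "global ns/token", "range ns/token");
        for (int i = 0; i < nodes.length; i++) {
            System.out.printf("%7d %10d %16.2f %16.2f%n",
                    nodes[i],
                    (long) nodes[i] * TOKENS_PER_NODE,
                    nanosPerToken(nodes[i], global[i]),
                    nanosPerToken(nodes[i], range[i]));
        }
    }
}
```

If the per-token figure for the global scan stays within a small factor across rows, the cost is roughly linear in the number of tokens, which is consistent with the 67 millisecond figure at 2000 * 256 tokens.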