[
https://issues.apache.org/jira/browse/CASSANDRA-16610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Stefan Miklosovic updated CASSANDRA-16610:
------------------------------------------
Description:
I implemented a partitioner based on the XXHash algorithm.
There are two branches. The first, xxhash, extracts the parts it shares with
Murmur, as there is a lot of overlap between the two.
The second, xxhash-2, copies everything from Murmur and changes only the bits
that are necessary.
I am not sure which path we want to take, so I have provided both to make them
easier to discuss.
I have written a microbenchmark measuring both partitioners, and the XXHash
implementation is very fast: around 10x faster on larger payloads (about 4x on
the smallest). The benchmark is included in the xxhash-2 branch.
https://github.com/instaclustr/cassandra/tree/xxhash-2
https://github.com/instaclustr/cassandra/tree/xxhash
{code:java}
[java] Benchmark                                  (bufferSize)  Mode  Cnt      Score   Error  Units
[java] PartitionersBench.benchMurmur3Partitioner            31  avgt   20    157.942 ± 0.110  ns/op
[java] PartitionersBench.benchMurmur3Partitioner            67  avgt   20    204.670 ± 0.152  ns/op
[java] PartitionersBench.benchMurmur3Partitioner           131  avgt   20    361.068 ± 0.228  ns/op
[java] PartitionersBench.benchMurmur3Partitioner           517  avgt   20   1325.670 ± 1.255  ns/op
[java] PartitionersBench.benchMurmur3Partitioner          1031  avgt   20   2594.651 ± 2.725  ns/op
[java] PartitionersBench.benchMurmur3Partitioner          2041  avgt   20   5082.166 ± 1.721  ns/op
[java] PartitionersBench.benchMurmur3Partitioner          4097  avgt   20  10112.020 ± 3.637  ns/op
[java] PartitionersBench.benchXXHashPartitioner             31  avgt   20     40.650 ± 0.025  ns/op
[java] PartitionersBench.benchXXHashPartitioner             67  avgt   20     53.305 ± 0.035  ns/op
[java] PartitionersBench.benchXXHashPartitioner            131  avgt   20     67.098 ± 0.057  ns/op
[java] PartitionersBench.benchXXHashPartitioner            517  avgt   20    150.415 ± 0.107  ns/op
[java] PartitionersBench.benchXXHashPartitioner           1031  avgt   20    265.614 ± 0.140  ns/op
[java] PartitionersBench.benchXXHashPartitioner           2041  avgt   20    365.796 ± 0.225  ns/op
[java] PartitionersBench.benchXXHashPartitioner           4097  avgt   20    925.841 ± 0.664  ns/op
{code}
https://github.com/OpenHFT/Zero-Allocation-Hashing
https://cyan4973.github.io/xxHash/
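
To make the "around 10x" figure easier to check, the per-size speedup can be derived directly from the scores in the table. The following is only a small illustrative sketch: the class name `PartitionerSpeedup` and its helper are hypothetical and are not part of either branch; the numbers are copied from the JMH output in this description.

```java
import java.util.Locale;

public class PartitionerSpeedup {
    // bufferSize values and mean ns/op scores copied from the JMH output in the description.
    static final int[] BUFFER_SIZES = {31, 67, 131, 517, 1031, 2041, 4097};
    static final double[] MURMUR3_NS = {157.942, 204.670, 361.068, 1325.670, 2594.651, 5082.166, 10112.020};
    static final double[] XXHASH_NS = {40.650, 53.305, 67.098, 150.415, 265.614, 365.796, 925.841};

    /** Ratio of Murmur3 time to XXHash time; values above 1 mean XXHash took less time per op. */
    static double speedup(double murmur3Ns, double xxhashNs) {
        return murmur3Ns / xxhashNs;
    }

    public static void main(String[] args) {
        // Print one speedup ratio per benchmarked bufferSize.
        for (int i = 0; i < BUFFER_SIZES.length; i++) {
            System.out.printf(Locale.ROOT, "bufferSize %4d: %.1fx%n",
                    BUFFER_SIZES[i], speedup(MURMUR3_NS[i], XXHASH_NS[i]));
        }
    }
}
```

On these numbers the ratio is roughly 4x at the two smallest buffer sizes and reaches roughly 10x or more from bufferSize 1031 upward.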
was:
I implemented a partitioner based on the XXHash algorithm.
There are two branches. The first, xxhash, extracts the parts it shares with
Murmur, as there is a lot of overlap between the two.
The second, xxhash-2, copies everything from Murmur and changes only the bits
that are necessary.
I am not sure which path we want to take, so I have provided both to make them
easier to discuss.
I have written a microbenchmark measuring both partitioners, and the XXHash
implementation is very fast, around 10x faster. The benchmark is included in
the xxhash-2 branch.
https://github.com/instaclustr/cassandra/tree/xxhash-2
https://github.com/instaclustr/cassandra/tree/xxhash
{code:java}
[java] Benchmark                                  (bufferSize)  Mode  Cnt      Score   Error  Units
[java] PartitionersBench.benchMurmur3Partitioner            31  avgt   20    157.942 ± 0.110  ns/op
[java] PartitionersBench.benchMurmur3Partitioner            67  avgt   20    204.670 ± 0.152  ns/op
[java] PartitionersBench.benchMurmur3Partitioner           131  avgt   20    361.068 ± 0.228  ns/op
[java] PartitionersBench.benchMurmur3Partitioner           517  avgt   20   1325.670 ± 1.255  ns/op
[java] PartitionersBench.benchMurmur3Partitioner          1031  avgt   20   2594.651 ± 2.725  ns/op
[java] PartitionersBench.benchMurmur3Partitioner          2041  avgt   20   5082.166 ± 1.721  ns/op
[java] PartitionersBench.benchMurmur3Partitioner          4097  avgt   20  10112.020 ± 3.637  ns/op
[java] PartitionersBench.benchXXHashPartitioner             31  avgt   20     40.650 ± 0.025  ns/op
[java] PartitionersBench.benchXXHashPartitioner             67  avgt   20     53.305 ± 0.035  ns/op
[java] PartitionersBench.benchXXHashPartitioner            131  avgt   20     67.098 ± 0.057  ns/op
[java] PartitionersBench.benchXXHashPartitioner            517  avgt   20    150.415 ± 0.107  ns/op
[java] PartitionersBench.benchXXHashPartitioner           1031  avgt   20    265.614 ± 0.140  ns/op
[java] PartitionersBench.benchXXHashPartitioner           2041  avgt   20    365.796 ± 0.225  ns/op
[java] PartitionersBench.benchXXHashPartitioner           4097  avgt   20    925.841 ± 0.664  ns/op
{code}
https://github.com/OpenHFT/Zero-Allocation-Hashing
https://cyan4973.github.io/xxHash/
> Implement XXHashPartitioner
> ---------------------------
>
> Key: CASSANDRA-16610
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16610
> Project: Cassandra
> Issue Type: New Feature
> Components: Legacy/Core
> Reporter: Stefan Miklosovic
> Priority: Normal
> Attachments: jmh-result.json