[ https://issues.apache.org/jira/browse/CASSANDRA-16610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefan Miklosovic updated CASSANDRA-16610:
------------------------------------------
    Description: 
I implemented a partitioner based on the XXHash algorithm.

There are two branches. The first, xxhash, extracts the parts it has in common 
with Murmur, as there is a lot of overlap between the two.

The second branch just copies everything from Murmur and changes only the bits 
that are necessary.

I am not sure which path we want to take, so I have provided both to make the 
discussion easier.
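
To make the difference between the two approaches concrete, here is a minimal sketch of the "extract common parts" shape. It is not the code from either branch: every class and method name below is hypothetical, the token handling is illustrative rather than copied from Cassandra, and the two hash functions are simple stand-ins (64-bit FNV-1a and a polynomial hash), not Murmur3 or XXHash.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class PartitionerSketch {

    /** Hypothetical base class: holds everything except the hash function. */
    static abstract class HashingPartitioner {

        /** The only piece a concrete partitioner has to supply. */
        protected abstract long hash(ByteBuffer key);

        /** Shared token logic (illustrative, not copied from Cassandra). */
        public final long token(ByteBuffer key) {
            if (!key.hasRemaining())
                return Long.MIN_VALUE;       // reserve the minimum value for empty keys
            long h = hash(key.duplicate());  // duplicate() leaves the caller's position untouched
            return h == Long.MIN_VALUE ? Long.MAX_VALUE : h; // keep real keys off the reserved value
        }
    }

    /** Stand-in for one algorithm: 64-bit FNV-1a (NOT Murmur3 or XXHash). */
    static final class Fnv1aPartitioner extends HashingPartitioner {
        @Override
        protected long hash(ByteBuffer key) {
            long h = 0xcbf29ce484222325L;    // FNV-1a 64-bit offset basis
            while (key.hasRemaining()) {
                h ^= key.get() & 0xffL;
                h *= 0x100000001b3L;         // FNV 64-bit prime
            }
            return h;
        }
    }

    /** Stand-in for a second algorithm: a simple polynomial hash. */
    static final class PolynomialPartitioner extends HashingPartitioner {
        @Override
        protected long hash(ByteBuffer key) {
            long h = 0;
            while (key.hasRemaining())
                h = 31 * h + (key.get() & 0xff);
            return h;
        }
    }

    /** Helper that wraps a string's UTF-8 bytes as a key. */
    static ByteBuffer bytes(String s) {
        return ByteBuffer.wrap(s.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        HashingPartitioner[] partitioners = { new Fnv1aPartitioner(), new PolynomialPartitioner() };
        for (HashingPartitioner p : partitioners)
            System.out.println(p.getClass().getSimpleName() + " -> " + p.token(bytes("some-key")));
    }
}
```

With this shape, adding another algorithm means overriding one method. The copy-based approach keeps each partitioner fully independent, at the cost of duplicating the shared logic.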

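I have written a microbenchmark measuring both partitioners, and the XXHash 
implementation is very fast: roughly 4x faster than Murmur3 on the smallest 
buffers and around 10x faster on buffers of about 1 KB and larger. The 
benchmark is included in the xxhash-2 branch.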

https://github.com/instaclustr/cassandra/tree/xxhash-2

https://github.com/instaclustr/cassandra/tree/xxhash

{code:java}
[java] Benchmark                                  (bufferSize)  Mode  Cnt      Score   Error  Units
[java] PartitionersBench.benchMurmur3Partitioner            31  avgt   20    157.942 ± 0.110  ns/op
[java] PartitionersBench.benchMurmur3Partitioner            67  avgt   20    204.670 ± 0.152  ns/op
[java] PartitionersBench.benchMurmur3Partitioner           131  avgt   20    361.068 ± 0.228  ns/op
[java] PartitionersBench.benchMurmur3Partitioner           517  avgt   20   1325.670 ± 1.255  ns/op
[java] PartitionersBench.benchMurmur3Partitioner          1031  avgt   20   2594.651 ± 2.725  ns/op
[java] PartitionersBench.benchMurmur3Partitioner          2041  avgt   20   5082.166 ± 1.721  ns/op
[java] PartitionersBench.benchMurmur3Partitioner          4097  avgt   20  10112.020 ± 3.637  ns/op
[java] PartitionersBench.benchXXHashPartitioner             31  avgt   20     40.650 ± 0.025  ns/op
[java] PartitionersBench.benchXXHashPartitioner             67  avgt   20     53.305 ± 0.035  ns/op
[java] PartitionersBench.benchXXHashPartitioner            131  avgt   20     67.098 ± 0.057  ns/op
[java] PartitionersBench.benchXXHashPartitioner            517  avgt   20    150.415 ± 0.107  ns/op
[java] PartitionersBench.benchXXHashPartitioner           1031  avgt   20    265.614 ± 0.140  ns/op
[java] PartitionersBench.benchXXHashPartitioner           2041  avgt   20    365.796 ± 0.225  ns/op
[java] PartitionersBench.benchXXHashPartitioner           4097  avgt   20    925.841 ± 0.664  ns/op
{code}
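
The speedup at each buffer size can be read off the scores above by dividing the Murmur3 time by the XXHash time. The following small sketch does that arithmetic. The class and method names are made up for illustration, and the numbers are the mean scores copied from the benchmark output with the error margins ignored.

```java
public class BenchmarkSpeedup {

    // Buffer sizes and mean scores (ns/op) copied from the benchmark output above.
    static final int[] BUFFER_SIZES = { 31, 67, 131, 517, 1031, 2041, 4097 };
    static final double[] MURMUR3_NS = { 157.942, 204.670, 361.068, 1325.670, 2594.651, 5082.166, 10112.020 };
    static final double[] XXHASH_NS = { 40.650, 53.305, 67.098, 150.415, 265.614, 365.796, 925.841 };

    /** Ratio of Murmur3 time to XXHash time for the i-th buffer size. */
    static double speedup(int i) {
        return MURMUR3_NS[i] / XXHASH_NS[i];
    }

    public static void main(String[] args) {
        for (int i = 0; i < BUFFER_SIZES.length; i++)
            System.out.printf("%5d bytes: %.1fx%n", BUFFER_SIZES[i], speedup(i));
    }
}
```

On these numbers the ratio stays below 4x for the two smallest buffers and exceeds 10x at 2041 and 4097 bytes, so the gap widens as keys get larger.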

https://github.com/OpenHFT/Zero-Allocation-Hashing
https://cyan4973.github.io/xxHash/






> Implement XXHashPartitioner
> ---------------------------
>
>                 Key: CASSANDRA-16610
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16610
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Legacy/Core
>            Reporter: Stefan Miklosovic
>            Priority: Normal


