[jira] [Commented] (CASSANDRA-16524) Upgrading SSL enabled Cassandra cluster from 3.11.10 to 4.0-beta4 failing with javax.net.ssl.SSLException: java.lang.IndexOutOfBoundsException

2021-04-18 Thread Berenguer Blasi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17324687#comment-17324687
 ] 

Berenguer Blasi commented on CASSANDRA-16524:
-

The root cause has nothing to do with an upgrade or SSL; it's just a 
generic BB resizing problem we have. So the unit tests should suffice and are 
the right fix imo, if I am not missing anything.

> Upgrading SSL enabled Cassandra cluster from 3.11.10 to 4.0-beta4 failing 
> with javax.net.ssl.SSLException: java.lang.IndexOutOfBoundsException
> --
>
> Key: CASSANDRA-16524
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16524
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Encryption
>Reporter: Alaykumar Barochia
>Assignee: Gianluca Righetto
>Priority: Normal
> Fix For: 4.0, 4.0-beta
>
> Attachments: system.log.ssl-error.txt
>
>
> Hi,
> We have an SSL-enabled cluster running on Apache Cassandra 3.11.10 and we 
> are trying to upgrade it to 4.0-beta4 as part of testing.
> The cluster size is 3x3 and it is deployed on Azure IaaS.
> {noformat}
> [cassandra@cass-521828978-1-1189299202 ~]$ nodetool status
> Datacenter: southcentral
> ========================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address      Load        Tokens  Owns (effective)  Host ID                               Rack
> UN  10.12.74.31  85.61 KiB   16      32.2%             6db7a7ef-3490-4823-9ff3-c60a32165124  2
> UN  10.12.74.42  263.27 KiB  16      27.6%             7ad99ecf-7c7d-4780-872b-7c68b6b19849  1
> UN  10.12.74.34  85.61 KiB   16      37.8%             41ce16b7-2ab2-44ea-a810-8391f7f3caf2  0
> Datacenter: westus
> ==================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address      Load        Tokens  Owns (effective)  Host ID                               Rack
> UN  10.12.90.11  90.63 KiB   16      38.9%             8d4cdb65-ff66-4bcd-8d4b-a4a0e893a728  2
> UN  10.12.90.6   85.61 KiB   16      34.5%             4f8007e9-fa3e-4e99-a9f9-f7bf9625      1
> UN  10.12.89.80  94.1 KiB    16      28.9%             11f86cb0-c86b-440e-848f-b160118f43d5  0
> {noformat}
> We placed the new 4.0-beta4 binary on the first seed node (10.12.74.31) and 
> started Cassandra.
> It started throwing the error below:
> {noformat}
> ERROR [Messaging-EventLoop-3-11] 2021-03-15 22:10:05,188 
> InboundConnectionInitiator.java:342 - Failed to properly handshake with peer 
> /10.12.74.42:52356. Closing the channel.
> io.netty.handler.codec.DecoderException: javax.net.ssl.SSLException: 
> java.lang.IndexOutOfBoundsException: writerIndex(8560) + 
> minWritableBytes(1977) exceeds maxCapacity(10240): 
> BufferPoolAllocator$Wrapped(ridx: 0, widx: 8560, cap: 10240/10240)
>   at 
> io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:471)
>   at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
>   at 
> io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
>   at 
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
>   at 
> io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:795)
>   at 
> io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:480)
>   at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378)
>   at 
> io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
>   at 
> io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
>   at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: javax.net.ssl.SSLException: java.lang.IndexOutOfBoundsException: 
> writerIndex(8560) + minWritableBytes(1977) exceeds maxCapacity(10240): 
> BufferPoolAllocator$Wrapped(ridx: 0, widx: 8560, cap: 10240/10240)
>   at 
> 
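
The figures in the exception quoted above can be checked by hand: 8560 + 1977 = 10537, which is more than the 10240-byte maximum capacity, so the write cannot fit unless the buffer is allowed to grow. The following is a minimal sketch of that kind of bounds check. `BoundedBufferSketch` and its methods are hypothetical names that only mirror the wording of the error message; this is not Netty's or Cassandra's actual code.

```java
// Hypothetical sketch of a bounded-buffer writability check.
// It only mirrors the arithmetic in the quoted error message.
final class BoundedBufferSketch {
    private final int maxCapacity;
    private final int writerIndex;

    BoundedBufferSketch(int maxCapacity, int writerIndex) {
        this.maxCapacity = maxCapacity;
        this.writerIndex = writerIndex;
    }

    /** True when minWritableBytes more bytes fit without exceeding maxCapacity. */
    boolean canWrite(int minWritableBytes) {
        return minWritableBytes <= maxCapacity - writerIndex;
    }

    /** Throws with a message shaped like the one in the report when the write does not fit. */
    void ensureWritable(int minWritableBytes) {
        if (!canWrite(minWritableBytes)) {
            throw new IndexOutOfBoundsException(String.format(
                "writerIndex(%d) + minWritableBytes(%d) exceeds maxCapacity(%d)",
                writerIndex, minWritableBytes, maxCapacity));
        }
    }

    public static void main(String[] args) {
        // Values taken from the quoted log line.
        BoundedBufferSketch buf = new BoundedBufferSketch(10240, 8560);
        System.out.println("room left: " + (10240 - 8560));
        System.out.println("1977 bytes fit: " + buf.canWrite(1977));
    }
}
```

With the values from the log, only 1680 bytes of room remain, so a request for 1977 writable bytes is rejected.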

[jira] [Updated] (CASSANDRA-16613) ProtocolVersion.V4 is still used in places in the code

2021-04-18 Thread Ekaterina Dimitrova (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova updated CASSANDRA-16613:

Fix Version/s: 4.0-rc

> ProtocolVersion.V4 is still used in places in the code
> --
>
> Key: CASSANDRA-16613
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16613
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Ekaterina Dimitrova
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 4.0-rc
>
>
> While working on CASSANDRA-16567, [~adelapena] observed that 
> _ProtocolVersion.V4_ is used in _ViewTest_.
> I decided to do a quick grep and found a list of places where we still 
> refer to V4; at least in many of the tests, it seems to have been left in 
> unintentionally.
> This ticket is to verify the usage of _ProtocolVersion.V4_ in the codebase 
> and bump it to V5 or the default version, similar to what was done in 
> CASSANDRA-16567, wherever there is a need. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16558) Fix rat check (April 2021)

2021-04-18 Thread Michael Semb Wever (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17324512#comment-17324512
 ] 

Michael Semb Wever commented on CASSANDRA-16558:


bq. doc/source/development/license_compliance.md needs to be 
doc/source/development/license_compliance.rst

Above patches updated accordingly. (license_compliance removed from 2.2 and 3.0 
where docs don't exist)
Screenshot of generated doc attached.

 !Screenshot 2021-04-18 at 15.34.01.png! 

> Fix rat check (April 2021)
> --
>
> Key: CASSANDRA-16558
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16558
> Project: Cassandra
>  Issue Type: Task
>  Components: Build, Packaging
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: High
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.0.x
>
> Attachments: Screenshot 2021-04-18 at 15.34.01.png
>
>
> The rat plugin in build.xml is a mess and not properly catching missing 
> license headers.




[jira] [Updated] (CASSANDRA-16558) Fix rat check (April 2021)

2021-04-18 Thread Michael Semb Wever (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Semb Wever updated CASSANDRA-16558:
---
Status: Patch Available  (was: In Progress)



[jira] [Comment Edited] (CASSANDRA-16558) Fix rat check (April 2021)

2021-04-18 Thread Michael Semb Wever (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17324512#comment-17324512
 ] 

Michael Semb Wever edited comment on CASSANDRA-16558 at 4/18/21, 1:39 PM:
--

bq. doc/source/development/license_compliance.md needs to be 
doc/source/development/license_compliance.rst

Above patches updated accordingly. (license_compliance removed from 2.2 and 3.0 
where docs don't exist)
Screenshot of generated doc attached.

 !Screenshot 2021-04-18 at 15.34.01.png|width=500! 


was (Author: michaelsembwever):
bq. doc/source/development/license_compliance.md needs to be 
doc/source/development/license_compliance.rst

Above patches updated accordingly. (license_compliance removed from 2.2 and 3.0 
where docs don't exist)
Screenshot of generated doc attached.

 !Screenshot 2021-04-18 at 15.34.01.png! 



[jira] [Updated] (CASSANDRA-16558) Fix rat check (April 2021)

2021-04-18 Thread Michael Semb Wever (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Semb Wever updated CASSANDRA-16558:
---
Attachment: Screenshot 2021-04-18 at 15.34.01.png



[jira] [Commented] (CASSANDRA-15669) LeveledCompactionStrategy compact last level throw an ArrayIndexOutOfBoundsException

2021-04-18 Thread Alexey Zotov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17324468#comment-17324468
 ] 

Alexey Zotov commented on CASSANDRA-15669:
--

In the end, writing a test turned out to be quite fast and easy, so I raised 
a PR ([https://github.com/apache/cassandra/pull/971]) to better illustrate my 
findings. Please review and let me know your thoughts.

> LeveledCompactionStrategy compact last level throw an 
> ArrayIndexOutOfBoundsException
> 
>
> Key: CASSANDRA-15669
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15669
> Project: Cassandra
>  Issue Type: Bug
>Reporter: sunhaihong
>Assignee: sunhaihong
>Priority: Normal
> Attachments: cfs_compaction_info.png, error_info.png
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Cassandra will throw an ArrayIndexOutOfBoundsException when compacting the 
> last level.
> My test is as follows:
>  # Create a table with LeveledCompactionStrategy whose params are 
> 'enabled': 'true', 'fanout_size': '2', 'max_threshold': '32', 
> 'min_threshold': '4', 'sstable_size_in_mb': '2' (fanout_size and 
> sstable_size_in_mb are deliberately very small, just to make the problem 
> easier to reproduce);
>  # Insert data into the table with stress;
>  # Cassandra throws an ArrayIndexOutOfBoundsException when compacting level9 
> sstables (this level's score is bigger than 1.001)
> ERROR [CompactionExecutor:4] 2020-03-28 08:59:00,990 CassandraDaemon.java:442 
> - Exception in thread Thread[CompactionExecutor:4,1,main]
>  java.lang.ArrayIndexOutOfBoundsException: 9
>  at 
> org.apache.cassandra.db.compaction.LeveledManifest.getLevel(LeveledManifest.java:814)
>  at 
> org.apache.cassandra.db.compaction.LeveledManifest.getCandidatesFor(LeveledManifest.java:746)
>  at 
> org.apache.cassandra.db.compaction.LeveledManifest.getCompactionCandidates(LeveledManifest.java:398)
>  at 
> org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getNextBackgroundTask(LeveledCompactionStrategy.java:131)
>  at 
> org.apache.cassandra.db.compaction.CompactionStrategyHolder.lambda$getBackgroundTaskSuppliers$0(CompactionStrategyHolder.java:109)
>  at 
> org.apache.cassandra.db.compaction.AbstractStrategyHolder$TaskSupplier.getTask(AbstractStrategyHolder.java:66)
>  at 
> org.apache.cassandra.db.compaction.CompactionStrategyManager.getNextBackgroundTask(CompactionStrategyManager.java:214)
>  at 
> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:289)
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>  at java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:266)
>  at java.util.concurrent.FutureTask.run(FutureTask.java)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>  at java.lang.Thread.run(Thread.java:748)
> I tested it on Cassandra versions 3.11.3 and 4.0-alpha3; the exception 
> happened on both.
> Once it triggers, level1 to leveln compaction no longer works; level0 is 
> still valid.
>  
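
The failure mode reported above can be illustrated with a fixed-size array of levels. This is a sketch under stated assumptions: the class `LevelsSketch`, the constant `LEVEL_COUNT = 9`, and the guard in `getLevelChecked` are hypothetical and are not taken from `LeveledManifest` or from the PR. The only detail drawn from the report is that index 9 was out of bounds.

```java
// Hypothetical illustration of an out-of-bounds level lookup.
final class LevelsSketch {
    // Assumption for illustration: a fixed number of levels, so valid indices are 0..8.
    static final int LEVEL_COUNT = 9;

    // One counter per level stands in for the per-level sstable collections.
    private final int[] sstablesPerLevel = new int[LEVEL_COUNT];

    /** Unchecked lookup: asking for level 9 indexes past the end of the array. */
    int getLevel(int level) {
        return sstablesPerLevel[level];
    }

    /** Checked lookup: levels outside the array are reported as empty. */
    int getLevelChecked(int level) {
        boolean inRange = level >= 0 && level < sstablesPerLevel.length;
        return inRange ? sstablesPerLevel[level] : 0;
    }

    public static void main(String[] args) {
        LevelsSketch levels = new LevelsSketch();
        System.out.println("checked lookup of level 9: " + levels.getLevelChecked(9));
        try {
            levels.getLevel(9);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("unchecked lookup failed: " + e.getMessage());
        }
    }
}
```

The checked variant is shown only for contrast; whether a guard, a larger array, or a cap on the target level is the appropriate fix is for the ticket to decide.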




[jira] [Comment Edited] (CASSANDRA-16610) Implement XXHashPartitioner

2021-04-18 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17324441#comment-17324441
 ] 

Stefan Miklosovic edited comment on CASSANDRA-16610 at 4/18/21, 7:43 AM:
-

I don't think it is only about providing new partitioners without giving any 
"added value". Sure, it is faster ... so what? Murmur is fast enough for the 
job. It makes sense to keep it pluggable (as it is now, by implementing 
IPartitioner), but it makes sense to code partitioners which each physically 
partition data differently, and each partitioner which already exists gives 
you different behaviour, so you can model your data and its placement around 
the cluster as you please.

But coding a partitioner which, philosophically, does nothing other than what 
Murmur does provides more or less nothing on top of what we have already. It 
just introduces another piece of code to take care of, and so on ... 

By the way, when I wanted to code it, I found myself either duplicating a lot 
of code (xxhash-2 branch) or extracting the common parts of Murmur into a base 
class which both Murmur and XXHash extend. It is rather "interesting" that once 
one wants to code a partitioner which hashes in more or less the same way as 
Murmur, there is nothing in place yet that would save a developer from having 
to either duplicate or extract as I did. So yes, maybe some overhauling of the 
code there, as Paulo mentioned, would help, but otherwise ... if we never want 
to introduce any other partitioner similar to Murmur, that does not make sense 
to do anyway.
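
The "extract a base class" option described above can be sketched as follows. Everything here is hypothetical: the class names are made up, the token normalisation in `token` is an assumption about what the shared logic might contain, and the two hash functions are simple stand-ins (FNV-1a and `Arrays.hashCode`), not Murmur3 or XXHash.

```java
import java.util.Arrays;

// Hypothetical base class holding the logic shared by hash-based partitioners.
abstract class HashTokenPartitionerSketch {
    /** Subclasses supply only the 64-bit hash of the key. */
    protected abstract long hash(byte[] key);

    /** Shared logic (assumed): map the raw hash to a token, never returning Long.MIN_VALUE. */
    final long token(byte[] key) {
        long h = hash(key);
        return h == Long.MIN_VALUE ? Long.MAX_VALUE : h;
    }
}

// Stand-in for one concrete partitioner: 64-bit FNV-1a.
final class FnvPartitionerSketch extends HashTokenPartitionerSketch {
    @Override
    protected long hash(byte[] key) {
        long h = 0xcbf29ce484222325L;      // FNV-1a 64-bit offset basis
        for (byte b : key) {
            h ^= (b & 0xff);
            h *= 0x100000001b3L;           // FNV-1a 64-bit prime
        }
        return h;
    }
}

// Stand-in for a second concrete partitioner: the JDK's array hash, widened to long.
final class ArraysHashPartitionerSketch extends HashTokenPartitionerSketch {
    @Override
    protected long hash(byte[] key) {
        return Arrays.hashCode(key);
    }
}
```

With this shape, adding another hash-based partitioner means overriding a single method, while the duplication approach would copy `token` and any other shared logic into every new class.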


was (Author: stefan.miklosovic):
I don't think it is only about providing new partitioners without giving any 
"added value". Sure it is faster ... so what. Murmur is fast enough for the 
job. It makes sense to make it pluggable (as it is now, by implementing 
IPartitioner) but it makes sense to code partitioners which are physically 
partitioning data every time differently and each partitioner which already 
exists gives you different behaviour so you can model your data as its 
placement around the cluster as you please.

But coding the partitioner which does nothing else, philosophically, as Murmur, 
provides more or less nothing on top of what we have already. It just 
introduces another piece of code to take care of and so on ... 


[jira] [Comment Edited] (CASSANDRA-16610) Implement XXHashPartitioner

2021-04-18 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17324441#comment-17324441
 ] 

Stefan Miklosovic edited comment on CASSANDRA-16610 at 4/18/21, 7:34 AM:
-

I don't think it is only about providing new partitioners without giving any 
"added value". Sure, it is faster ... so what? Murmur is fast enough for the 
job. It makes sense to keep it pluggable (as it is now, by implementing 
IPartitioner), but it makes sense to code partitioners which each physically 
partition data differently, and each partitioner which already exists gives 
you different behaviour, so you can model your data and its placement around 
the cluster as you please.

But coding a partitioner which, philosophically, does nothing other than what 
Murmur does provides more or less nothing on top of what we have already. It 
just introduces another piece of code to take care of, and so on ... 


was (Author: stefan.miklosovic):
I don't think it is only about providing new partitioners without giving any 
"added value". Sure it is faster ... so what. Murmur is fast enough for the 
job. It makes sense to make it pluggable (as it is now, for example by 
implementing IPartitioner) but it makes sense to code partitioners which are 
physically partition data every time differently and each partitioner which 
already exists gives you different behaviour so you can model your data as its 
placement around the cluster as you please.

But coding the partitioner which does nothing else, philosophically, as Murmur, 
provides more or less nothing on top of what we have already. It just 
introduces another piece of code to take care of and so on ... 


[jira] [Commented] (CASSANDRA-16610) Implement XXHashPartitioner

2021-04-18 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17324441#comment-17324441
 ] 

Stefan Miklosovic commented on CASSANDRA-16610:
---

I don't think it is only about providing new partitioners without giving any 
"added value". Sure, it is faster ... so what? Murmur is fast enough for the 
job. It makes sense to keep it pluggable (as it is now, for example by 
implementing IPartitioner), but it makes sense to code partitioners which each 
physically partition data differently, and each partitioner which already 
exists gives you different behaviour, so you can model your data and its 
placement around the cluster as you please.

But coding a partitioner which, philosophically, does nothing other than what 
Murmur does provides more or less nothing on top of what we have already. It 
just introduces another piece of code to take care of, and so on ... 

> Implement XXHashPartitioner
> ---
>
> Key: CASSANDRA-16610
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16610
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Legacy/Core
>Reporter: Stefan Miklosovic
>Priority: Normal
> Attachments: jmh-result.json
>
>
> I implemented a partitioner based on the XXHash algorithm.
> There are two branches. The first, xxhash, extracts the parts common with 
> Murmur, as there is a lot of overlap between the two.
> The second branch just copies everything from Murmur and changes only the 
> bits which are necessary.
> I am not sure which path we want to go with, so I provided both to make them 
> easier to discuss.
> I have written a microbenchmark measuring both partitioners, and the XXHash 
> implementation is very fast, around 10x faster (on larger payloads). 
> The benchmark is included in the xxhash-2 branch.
> https://github.com/instaclustr/cassandra/tree/xxhash-2
> https://github.com/instaclustr/cassandra/tree/xxhash
> {code:java}
> [java] Benchmark                                  (bufferSize)  Mode  Cnt      Score   Error  Units
> [java] PartitionersBench.benchMurmur3Partitioner            31  avgt   20    157.942 ± 0.110  ns/op
> [java] PartitionersBench.benchMurmur3Partitioner            67  avgt   20    204.670 ± 0.152  ns/op
> [java] PartitionersBench.benchMurmur3Partitioner           131  avgt   20    361.068 ± 0.228  ns/op
> [java] PartitionersBench.benchMurmur3Partitioner           517  avgt   20   1325.670 ± 1.255  ns/op
> [java] PartitionersBench.benchMurmur3Partitioner          1031  avgt   20   2594.651 ± 2.725  ns/op
> [java] PartitionersBench.benchMurmur3Partitioner          2041  avgt   20   5082.166 ± 1.721  ns/op
> [java] PartitionersBench.benchMurmur3Partitioner          4097  avgt   20  10112.020 ± 3.637  ns/op
> [java] PartitionersBench.benchXXHashPartitioner             31  avgt   20     40.650 ± 0.025  ns/op
> [java] PartitionersBench.benchXXHashPartitioner             67  avgt   20     53.305 ± 0.035  ns/op
> [java] PartitionersBench.benchXXHashPartitioner            131  avgt   20     67.098 ± 0.057  ns/op
> [java] PartitionersBench.benchXXHashPartitioner            517  avgt   20    150.415 ± 0.107  ns/op
> [java] PartitionersBench.benchXXHashPartitioner           1031  avgt   20    265.614 ± 0.140  ns/op
> [java] PartitionersBench.benchXXHashPartitioner           2041  avgt   20    365.796 ± 0.225  ns/op
> [java] PartitionersBench.benchXXHashPartitioner           4097  avgt   20    925.841 ± 0.664  ns/op
> {code}
> {code:java}
> [java] PartitionersBench.benchMurmur3Partitioner             3  avgt    5     44.516 ± 0.345  ns/op
> [java] PartitionersBench.benchMurmur3Partitioner             5  avgt    5     54.930 ± 0.450  ns/op
> [java] PartitionersBench.benchMurmur3Partitioner             7  avgt    5     63.428 ± 0.266  ns/op
> [java] PartitionersBench.benchMurmur3Partitioner             9  avgt    5     69.456 ± 0.467  ns/op
> [java] PartitionersBench.benchMurmur3Partitioner            11  avgt    5     81.411 ± 0.535  ns/op
> [java] PartitionersBench.benchMurmur3Partitioner            16  avgt    5     68.621 ± 0.417  ns/op
> [java] PartitionersBench.benchXXHashPartitioner              3  avgt    5     26.820 ± 0.271  ns/op
> [java] PartitionersBench.benchXXHashPartitioner              5  avgt    5     28.182 ± 0.139  ns/op
> [java] PartitionersBench.benchXXHashPartitioner              7  avgt    5     31.557 ± 0.161  ns/op
> [java] PartitionersBench.benchXXHashPartitioner              9  avgt    5     31.017 ± 0.212  ns/op
> [java] PartitionersBench.benchXXHashPartitioner             11  avgt    5     33.233 ± 0.136  ns/op
> [java] PartitionersBench.benchXXHashPartitioner             16  avgt    5     31.386 ± 0.128  ns/op
> {code}
> https://github.com/OpenHFT/Zero-Allocation-Hashing
> https://cyan4973.github.io/xxHash/
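
The speedup claimed in the description can be read off the first benchmark table by dividing the Murmur3 score by the XXHash score at each buffer size. Below is a small sketch of that arithmetic. The class and method names are illustrative only; the numbers are copied from the quoted JMH output.

```java
// Illustrative helper that turns the quoted ns/op scores into speedup ratios.
final class SpeedupSketch {
    // Scores copied from the first benchmark table in the issue description.
    static final int[] BUFFER_SIZES = {31, 67, 131, 517, 1031, 2041, 4097};
    static final double[] MURMUR3_NS = {157.942, 204.670, 361.068, 1325.670, 2594.651, 5082.166, 10112.020};
    static final double[] XXHASH_NS = {40.650, 53.305, 67.098, 150.415, 265.614, 365.796, 925.841};

    /** Ratio of the baseline's average time to the candidate's (both in ns/op). */
    static double speedup(double baselineNsPerOp, double candidateNsPerOp) {
        return baselineNsPerOp / candidateNsPerOp;
    }

    public static void main(String[] args) {
        for (int i = 0; i < BUFFER_SIZES.length; i++) {
            System.out.printf("bufferSize=%d speedup=%.2fx%n",
                BUFFER_SIZES[i], speedup(MURMUR3_NS[i], XXHASH_NS[i]));
        }
    }
}
```

On these figures the ratio is a little under 4x at a buffer size of 31 and about 10.9x at 4097, which matches the "around 10x faster" statement for the larger payloads only.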