[
https://issues.apache.org/jira/browse/CASSANDRA-13896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16421472#comment-16421472
]
Michael Burman commented on CASSANDRA-13896:
--------------------------------------------
I'd like to be proven wrong, but I think this is a profiling error (maybe
safepoint bias). I can't reproduce that profiling output, but I do see the same
throughput limit, and it hits me at around 9 cores (using 128 concurrent
writers made no difference). But since profilers can be wrong, I also wrote a
small patch for this issue that removes the contention from that single Region
(the contention is very real, but we don't hit it yet):
[https://github.com/burmanm/cassandra/commit/69f2e492bb7c49d4411124ec45ebf2dd31b153bc]
With the patch I could not see any difference, so I doubt this is the real
bottleneck at this point.
> Improving Cassandra write performance
> ---------------------------------------
>
> Key: CASSANDRA-13896
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13896
> Project: Cassandra
> Issue Type: Improvement
> Components: Local Write-Read Paths
> Environment: Skylake server with 2 sockets, 192GB RAM, 3xPCIe SSDs
> OS: Centos 7.3
> Java: Oracle JDK1.8.0_121
> Reporter: Prince Nana Owusu Boateng
> Priority: Major
> Labels: Performance
> Fix For: 4.x
>
> Attachments: Screen Shot 2017-09-22 at 11.22.43 AM.png, Screen Shot
> 2017-09-22 at 3.31.09 PM.png
>
>
> During our Cassandra performance testing, we see a high percentage of CPU time
> spent in the *org.apache.cassandra.utils.memory.SlabAllocator.allocate(int,
> OpOrder.Group)* method. There appears to be high contention on the
> *<nextFreeOffset>* AtomicInteger in write workloads. This field is
> used by the threads to keep track of allocation within the region's ByteBuffer.
> When the contention appears, neither adding more clients nor modifying
> write-specific parameters changes write throughput. Attached
> are details from Java Flight Recorder (JFR) showing the hot functions, along
> with performance results. When we see this contention, we still have plenty of
> CPU and storage throughput left (*<20%* total average CPU utilization and
> *<11%* of the total storage write throughput). This occurs on Cassandra
> 3.10.0-src using cassandra-stress.
> Proposal:
> We would like to introduce a solution that eliminates the atomic operations
> on the *<nextFreeOffset>* AtomicInteger. This implementation will allow
> concurrent allocation of ByteBuffers without atomic compareAndSet and
> incrementAndGet operations. The solution is expected to increase overall
> write performance while improving CPU utilization.
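
For readers unfamiliar with the pattern under discussion, here is a minimal
sketch of a bump-pointer region whose cursor is a shared AtomicInteger. It is
an illustration only, not Cassandra's actual SlabAllocator code: the class name
SketchRegion, the used() helper, and the heap-allocated buffer are all
assumptions made for the example. It shows why every writer thread ends up
competing on one field: each allocation has to win a compareAndSet on the same
offset.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical bump-pointer region; NOT Cassandra's SlabAllocator.Region. */
final class SketchRegion {
    private final ByteBuffer data;
    // Shared cursor: all allocating threads read and CAS this single field.
    private final AtomicInteger nextFreeOffset = new AtomicInteger(0);

    SketchRegion(int capacity) {
        this.data = ByteBuffer.allocate(capacity);
    }

    /** Returns a buffer of exactly size bytes, or null if the region is full. */
    ByteBuffer allocate(int size) {
        while (true) {
            int oldOffset = nextFreeOffset.get();
            if (oldOffset + size > data.capacity())
                return null; // a real allocator would move on to a fresh region
            // If another thread advanced the cursor first, this CAS fails
            // and the loop retries with the new offset.
            if (nextFreeOffset.compareAndSet(oldOffset, oldOffset + size)) {
                ByteBuffer view = data.duplicate();
                view.position(oldOffset);
                view.limit(oldOffset + size);
                return view.slice();
            }
        }
    }

    /** Number of bytes handed out so far (helper added for the example). */
    int used() {
        return nextFreeOffset.get();
    }
}
```

One way to take the shared cursor off the hot path would be to give each
writer thread its own region, so the offset needs no atomic update. Whether
the proposal or the patch linked in the comment takes that route is not stated
in this thread.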