[
https://issues.apache.org/jira/browse/CASSANDRA-13896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16441314#comment-16441314
]
Michael Burman commented on CASSANDRA-13896:
--------------------------------------------
[~aweisberg] I think my patch was misunderstood. It is meant rather to prove
that this isn't the contention point Cassandra is hitting at the moment. If it
were, the patch would increase performance at least somewhat, since it reduces
the contention on that single CAS. But it has no effect on performance at all.
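For context, the allocation path under discussion bumps an offset inside a
shared region with a CAS loop; every writer thread races on one AtomicInteger.
A minimal sketch of that pattern follows (names are simplified stand-ins for
SlabAllocator's internal Region, not the real API):

```java
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicInteger;

// Simplified sketch of the bump-pointer pattern discussed above.
class Region {
    private final ByteBuffer data;
    private final AtomicInteger nextFreeOffset = new AtomicInteger(0);

    Region(int capacity) {
        data = ByteBuffer.allocate(capacity);
    }

    // Returns a slice of `size` bytes, or null if the region is exhausted.
    // All allocating threads contend on the same AtomicInteger: under heavy
    // write load the compareAndSet retries are the suspected hot spot.
    ByteBuffer allocate(int size) {
        while (true) {
            int oldOffset = nextFreeOffset.get();
            if (oldOffset + size > data.capacity())
                return null; // region full; caller must swap in a new region
            if (nextFreeOffset.compareAndSet(oldOffset, oldOffset + size)) {
                ByteBuffer slice = data.duplicate();
                slice.position(oldOffset).limit(oldOffset + size);
                return slice;
            }
            // CAS lost the race against another writer; retry
        }
    }
}
```

The point of the comment above is that relieving pressure on this one CAS did
not move the throughput needle, which suggests the real bottleneck is upstream.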
The scaling bottleneck most probably lies much earlier in the pipeline, most
likely in our threading-model executors: we can't process enough data from the
network. There is a very large change in CPI (and a reduction in front-end
bound) when batching multiple data points. This considerably reduces the hits
to do_futex, for example, and perf-map-agent shows reduced hits to our
SharedExecutorPool's CSLM. We probably either do too much in the Netty pipeline
before handing off the work, or the mutation pipeline is simply too long to be
effective (instead of smaller stages where we could reuse the same thread, and
potentially the same L1/L2/L3 cache contents, for better performance). But I
have no patch for that yet; profilers are not exactly brilliant at showing the
hot spot when the problem is "not doing anything".
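The batching observation above can be illustrated with a sketch: submitting
mutations one task at a time pays an executor handoff (and a potential thread
wakeup, the do_futex calls mentioned) per item, while handing over a batch
amortizes that cost and keeps the worker on the same thread and caches. The
names here are hypothetical, not Cassandra's SharedExecutorPool:

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of per-item vs batched executor handoff.
class BatchSubmitSketch {
    static final AtomicInteger processed = new AtomicInteger();

    static void applyMutation(int mutation) {
        processed.incrementAndGet(); // stand-in for the real write path
    }

    // One task per mutation: each submit is a queue handoff and a
    // potential worker-thread wakeup.
    static void submitIndividually(ExecutorService pool, List<Integer> mutations) {
        for (int m : mutations)
            pool.submit(() -> applyMutation(m));
    }

    // One task per batch: a single handoff, and the worker reuses its
    // thread (and warm caches) across the whole batch.
    static void submitBatched(ExecutorService pool, List<Integer> mutations) {
        pool.submit(() -> {
            for (int m : mutations)
                applyMutation(m);
        });
    }
}
```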
And you're right about the @Contended annotation; I only tried it in several
places to make sure false sharing wasn't responsible for the lack of
improvement. That particular placement just happened to be in the pushed
commit, as I wasn't intending this to be the fix, but rather to prove that this
isn't the correct scaling limitation - yet.
> Improving Cassandra write performance
> ---------------------------------------
>
> Key: CASSANDRA-13896
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13896
> Project: Cassandra
> Issue Type: Improvement
> Components: Local Write-Read Paths
> Environment: Skylake server with 2 sockets, 192GB RAM, 3xPCIe SSDs
> OS: Centos 7.3
> Java: Oracle JDK1.8.0_121
> Reporter: Prince Nana Owusu Boateng
> Priority: Major
> Labels: Performance
> Fix For: 4.x
>
> Attachments: Screen Shot 2017-09-22 at 11.22.43 AM.png, Screen Shot
> 2017-09-22 at 3.31.09 PM.png
>
>
> During our Cassandra performance testing, we see a high percentage of the CPU
> time spent in the *org.apache.cassandra.utils.memory.SlabAllocator.allocate(int,
> OpOrder.Group)* method. There appears to be high contention on the
> *nextFreeOffset* AtomicInteger in write workloads. This structure is used by
> the threads to keep track of the region ByteBuffer allocation. Once the
> contention appears, adding more clients or modifying write-specific
> parameters does not change write throughput. Attached are Java Flight
> Recorder (JFR) details showing the hot functions, along with performance
> results. When we see this contention, we still have plenty of CPU and
> throughput headroom ( *<20%* total average CPU utilization and *<11%* of the
> total storage write throughput). This occurs on the Cassandra 3.10.0-src
> version using Cassandra-Stress.
> Proposal:
> We would like to introduce a solution that eliminates the atomic operations
> on the *nextFreeOffset* AtomicInteger. This implementation will allow
> concurrent allocation of ByteBuffers without atomic compareAndSet and
> incrementAndGet operations. The solution is expected to increase overall
> write performance while improving CPU utilization.
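One hypothetical shape such a solution could take (this is an illustrative
sketch, not the reporter's actual patch or Cassandra's implementation) is a
TLAB-style scheme: each thread reserves a larger chunk from the shared region
with a single atomic operation, then serves its small allocations from that
chunk with plain, uncontended arithmetic. All names and sizes below are made up
for illustration:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical per-thread chunking sketch: one atomic op per chunk
// reservation instead of one CAS per allocation.
class ThreadLocalBumpAllocator {
    static final int REGION_SIZE = 1 << 20;  // shared region: 1 MiB (illustrative)
    static final int CHUNK_SIZE = 4096;      // per-thread reservation (illustrative)

    private final AtomicInteger regionOffset = new AtomicInteger(0);

    // Per-thread bump state: {chunkStart, chunkEnd} within the region.
    private final ThreadLocal<int[]> chunk =
            ThreadLocal.withInitial(() -> new int[] {0, 0});

    // Returns the region offset of `size` bytes (size <= CHUNK_SIZE assumed),
    // or -1 when the region is full.
    int allocate(int size) {
        int[] c = chunk.get();
        if (c[0] + size > c[1]) {                           // local chunk exhausted
            int start = regionOffset.getAndAdd(CHUNK_SIZE); // the only atomic op
            if (start + CHUNK_SIZE > REGION_SIZE)
                return -1;                                  // region full
            c[0] = start;
            c[1] = start + CHUNK_SIZE;
        }
        int offset = c[0];
        c[0] += size;                                       // plain, uncontended bump
        return offset;
    }
}
```

The trade-off is internal fragmentation: each thread can strand up to nearly a
full chunk of unused bytes when the region is retired.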
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]