[ 
https://issues.apache.org/jira/browse/CASSANDRA-14747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16634851#comment-16634851
 ] 

Joseph Lynch commented on CASSANDRA-14747:
------------------------------------------

Ah yeah, I see that's a problem. I worked around it by making a new callback 
just for that case. While I was testing it out I also tested flushing 
unconditionally 
([https://gist.github.com/jolynch/966e0e52f34eff7a7b8ac8d5a9cb4b5d#file-some-more-tweaks-diff]), 
and CPU usage dropped by about half and the flamegraph looks _excellent_.

I've attached the flamegraph as [^4.0.12-after-unconditional-flush.svg], where 
we can see that after the unconditional flush we are now spending less than 7% 
CPU usage (compared to about 70% before). I think that with 198 other nodes we 
were spending a lot of time waiting with unflushed data in the channel, 
because there are 195 other queues that get serviced before yours is serviced 
again and fills up the channel.
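
To make that intuition concrete, here is a rough toy model of it. This is only a 
sketch under stated assumptions, not the Cassandra or Netty code: one loop 
services every peer queue round-robin and writes one message per visit, and 
the hypothetical flush_threshold parameter stands in for "flush only once 
enough data has accumulated". The function name and the numbers are 
illustrative only.

```python
def mean_unflushed_wait(num_peers, rounds, flush_threshold):
    """Average number of service ticks a written message sits unflushed.

    Toy model (assumption): one tick == servicing one peer queue, and each
    visit writes exactly one message into that peer's channel.
    flush_threshold == 1 models flushing unconditionally after every write.
    """
    # Per peer: the tick at which each still-unflushed message was written.
    pending = [[] for _ in range(num_peers)]
    total_wait = 0
    flushed = 0
    tick = 0
    for _ in range(rounds):
        for peer in range(num_peers):
            pending[peer].append(tick)  # write one message for this peer
            if len(pending[peer]) >= flush_threshold:
                # Flush: everything buffered for this peer goes out now.
                total_wait += sum(tick - written for written in pending[peer])
                flushed += len(pending[peer])
                pending[peer].clear()
            tick += 1
    return total_wait / flushed if flushed else 0.0


if __name__ == "__main__":
    # 200 peers and a threshold of 4 are arbitrary illustrative values.
    print("unconditional:", mean_unflushed_wait(200, 8, 1))
    print("threshold=4:  ", mean_unflushed_wait(200, 8, 4))
```

In this model the wait grows linearly with the number of queues serviced 
between visits, which matches the shape of the problem described above, though 
it says nothing about the real magnitudes.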

We're not done yet as we still have dropped messages (vs 3.0 which has very few 
if any dropped), but this is much better. 

> Evaluate 200 node, compression=none, encryption=none, coalescing=off 
> ---------------------------------------------------------------------
>
>                 Key: CASSANDRA-14747
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14747
>             Project: Cassandra
>          Issue Type: Sub-task
>            Reporter: Joseph Lynch
>            Assignee: Joseph Lynch
>            Priority: Major
>         Attachments: 3.0.17-QPS.png, 4.0.1-QPS.png, 
> 4.0.11-after-jolynch-tweaks.svg, 4.0.12-after-unconditional-flush.svg, 
> 4.0.7-before-my-changes.svg, 4.0_errors_showing_heap_pressure.txt, 
> 4.0_heap_histogram_showing_many_MessageOuts.txt, 
> i-0ed2acd2dfacab7c1-after-looping-fixes.svg, 
> ttop_NettyOutbound-Thread_spinning.txt, 
> useast1c-i-0e1ddfe8b2f769060-mutation-flame.svg, 
> useast1e-i-08635fa1631601538_flamegraph_96node.svg, 
> useast1e-i-08635fa1631601538_ttop_netty_outbound_threads_96nodes, 
> useast1e-i-08635fa1631601538_uninlinedcpuflamegraph.0_96node_60sec_profile.svg
>
>
> Tracks evaluating a 200 node cluster with all internode settings off (no 
> compression, no encryption, no coalescing).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

