[ https://issues.apache.org/jira/browse/CASSANDRA-14605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16559654#comment-16559654 ]

Benedict commented on CASSANDRA-14605:
--------------------------------------

Probably the issue is that, with an LCS major compaction, we do a great deal 
of work that is unnecessary. By definition, most of the sstables will not 
intersect with the recently modified position of the latest sstable, and by 
default LCS has very small sstables - so there are a great many that we loop 
over unnecessarily.
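
To make that concrete, here is a minimal, self-contained Java sketch of the 
kind of shortcut described above: filter readers by range intersection before 
doing any per-reader work. {{Reader}} and {{readersNeedingMoveStart}} are 
hypothetical stand-ins for illustration, not Cassandra's actual 
{{SSTableReader}} / {{moveStarts}} API:

{code:java}
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only: Reader and readersNeedingMoveStart are
// illustrative stand-ins, not Cassandra's SSTableReader/moveStarts API.
final class MoveStartsSketch
{
    // Stand-in for the token range [firstToken, lastToken] an sstable covers.
    static final class Reader
    {
        final long firstToken;
        final long lastToken;

        Reader(long firstToken, long lastToken)
        {
            this.firstToken = firstToken;
            this.lastToken = lastToken;
        }
    }

    // Keep only readers whose range can contain keys at or before the new
    // lower bound; a reader that starts after it needs no start adjustment.
    static List<Reader> readersNeedingMoveStart(List<Reader> readers, long newLowerBound)
    {
        List<Reader> affected = new ArrayList<>();
        for (Reader r : readers)
        {
            if (r.firstToken <= newLowerBound)
                affected.add(r);
        }
        return affected;
    }

    public static void main(String[] args)
    {
        List<Reader> readers = List.of(new Reader(0, 10), new Reader(20, 30), new Reader(5, 25));
        // Boundary at token 15: only the first and third readers need work.
        System.out.println(readersNeedingMoveStart(readers, 15).size()); // prints 2
    }
}
{code}

With many small LCS sstables, most readers would fail this check, so the 
expensive per-reader work never runs for them.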

I think we can improve the status quo quite straightforwardly, but once we 
have done so we should probably revisit the whole approach to managing the 
key cache here: this code has been around since time immemorial, and may not 
translate so well to our current architecture.

 

> Major compaction of LCS tables very slow
> ----------------------------------------
>
>                 Key: CASSANDRA-14605
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14605
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Compaction
>         Environment: AWS, i3.4xlarge instance (very fast local nvme storage), 
> Linux 4.13
> Cassandra 3.0.16
>            Reporter: Joseph Lynch
>            Assignee: Benedict
>            Priority: Minor
>              Labels: lcs, performance
>         Attachments: slow_major_compaction_lcs.svg
>
>
> We've recently started deploying 3.0.16 more heavily in production and today 
> I noticed that full compaction of LCS tables takes a much longer time than it 
> should. In particular it appears to be faster to convert a large dataset to 
> STCS, run full compaction, and then convert it to LCS (with re-leveling) than 
> it is to just run full compaction on LCS (with re-leveling).
> I was able to get a CPU flame graph showing 50% of the major compaction's 
> CPU time being spent in 
> [{{SSTableRewriter::maybeReopenEarly}}|https://github.com/apache/cassandra/blob/6ba2fb9395226491872b41312d978a169f36fcdb/src/java/org/apache/cassandra/io/sstable/SSTableRewriter.java#L184] 
> calling 
> [{{SSTableRewriter::moveStarts}}|https://github.com/apache/cassandra/blob/6ba2fb9395226491872b41312d978a169f36fcdb/src/java/org/apache/cassandra/io/sstable/SSTableRewriter.java#L223].
> I've attached the flame graph, which I generated by running Cassandra with 
> {{-XX:+PreserveFramePointer}}, using jstack to find the compaction thread's 
> native thread id (nid), and then using perf to sample on-CPU time:
> {noformat}
> perf record -t <compaction thread> -o <output file> -F 49 -g sleep 60 >/dev/null
> {noformat}
> I took this data and collapsed it following the steps described in the 
> Instructions section of [Brendan Gregg's Java in Flames 
> blogpost|https://medium.com/netflix-techblog/java-in-flames-e763b3d32166] 
> to generate the graph.
> The result is that, at least on this dataset (700GB of data compressed, 
> 2.2TB uncompressed), we are spending 50% of our CPU time in {{moveStarts}}, 
> and I am unsure that we need to be doing that as frequently as we are. I'll 
> see if I can come up with a clean reproduction to confirm whether it's a 
> general problem or specific to this particular dataset.


