[ https://issues.apache.org/jira/browse/CASSANDRA-13354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15932799#comment-15932799 ]
Jan Karlsson edited comment on CASSANDRA-13354 at 3/20/17 3:45 PM:
-------------------------------------------------------------------

Added patch on 4.0 to fix this. Applies cleanly to other versions as well (tested on 2.2.9). I have tested this in a cluster and will upload some graphs as well.

Comments and suggestions welcome!


was (Author: jan karlsson):
Added patch on 4.0 to fix this. Should be pretty minimal work to get this to apply to other versions as well. I have tested this in a cluster and will upload some graphs as well.

Comments and suggestions welcome!

> LCS estimated compaction tasks does not take number of files into account
> -------------------------------------------------------------------------
>
>                 Key: CASSANDRA-13354
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13354
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Compaction
>         Environment: Cassandra 2.2.9
>            Reporter: Jan Karlsson
>            Assignee: Jan Karlsson
>         Attachments: 13354-trunk.txt
>
>
> In LCS, the way we estimate the number of compaction tasks remaining for L0
> is by taking the size of an SSTable and multiplying it by four. This would
> give 4*160 MB with default settings. This calculation is used to determine
> whether repaired or unrepaired data is being compacted.
> Now this works well until you take repair into account. Repair streams over
> many SSTables, which could be smaller than the configured SSTable size
> depending on your use case. In our case we are talking about many thousands
> of tiny SSTables. As the number of files increases, one can run into any
> number of problems, including GC issues, too many open files, or a plain
> increase in read latency.
> With the current algorithm we will choose repaired or unrepaired depending on
> whichever side has more data in it, even if the repaired files outnumber the
> unrepaired files by a large margin.
> Similarly, our algorithm that selects compaction candidates takes up to 32
> SSTables at a time in L0; however, our estimated task calculation does not
> take this number into account. These two mechanisms should be aligned with
> each other.
> I propose that we take the number of files in L0 into account when estimating
> remaining tasks.
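
To make the mismatch described in the ticket concrete, here is a minimal sketch comparing a size-only L0 estimate with one that also counts files. It is not the attached patch and not Cassandra's actual code: the class and method names are hypothetical, the size-only formula is an assumed reading of "the size of an SSTable multiplied by four" as a threshold, and combining the two estimates with a maximum is one possible choice among several. Only the 160 MB default and the limit of 32 L0 SSTables per compaction come from the description.

```java
/**
 * Hypothetical sketch, not Cassandra's actual code: compares a size-only
 * L0 task estimate with one that also takes the number of files into account.
 */
final class L0TaskEstimate {
    // Values quoted in the ticket: 160 MB default SSTable size, and at most
    // 32 L0 SSTables selected per compaction.
    static final long SSTABLE_SIZE_BYTES = 160L * 1024 * 1024;
    static final int MAX_L0_CANDIDATES = 32;

    /**
     * Size-only estimate (assumed form): L0 only produces tasks once it
     * holds more than four times the configured SSTable size.
     */
    static long sizeBased(long totalBytes) {
        long excess = Math.max(0L, totalBytes - 4 * SSTABLE_SIZE_BYTES);
        return (excess + SSTABLE_SIZE_BYTES - 1) / SSTABLE_SIZE_BYTES; // round up
    }

    /**
     * File-aware estimate (assumption): at least one task per batch of
     * MAX_L0_CANDIDATES files, never less than the size-only estimate.
     */
    static long fileAware(int fileCount, long totalBytes) {
        long byCount = (fileCount + MAX_L0_CANDIDATES - 1) / MAX_L0_CANDIDATES; // round up
        return Math.max(sizeBased(totalBytes), byCount);
    }

    public static void main(String[] args) {
        // Repaired side: 5000 tiny SSTables of 100 KB each (streamed by repair).
        int repairedFiles = 5000;
        long repairedBytes = 5000L * 100 * 1024;
        // Unrepaired side: 10 full-size SSTables.
        int unrepairedFiles = 10;
        long unrepairedBytes = 10L * SSTABLE_SIZE_BYTES;

        System.out.println("repaired   size-only=" + sizeBased(repairedBytes)
                + " file-aware=" + fileAware(repairedFiles, repairedBytes));
        System.out.println("unrepaired size-only=" + sizeBased(unrepairedBytes)
                + " file-aware=" + fileAware(unrepairedFiles, unrepairedBytes));
    }
}
```

In this made-up scenario the size-only estimate gives 0 tasks for the repaired side and 6 for the unrepaired side, so the unrepaired side would be chosen despite the 5000 pending files. The file-aware variant gives 157 and 6, which lines up the estimate with the 32-SSTable candidate limit.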