[ 
https://issues.apache.org/jira/browse/CASSANDRA-13354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15939107#comment-15939107
 ] 

Jan Karlsson commented on CASSANDRA-13354:
------------------------------------------

I did some tests simulating traffic on a 4 node cluster. 2 of the nodes were 
running with my patch while the other two ran without it.
Steps to reproduce:
1. Turn traffic on
2. Turn one of the nodes off
3. Wait 7 minutes
4. Truncate hints on all the other nodes
5. Turn the node back on
6. Run repair on the node

As the attached graphs show, the SSTable count on the unpatched nodes kept 
increasing, because non-repaired data from the ongoing traffic was prioritized. 
With more discrepancies in the data set, this would keep growing until hitting 
the configured file descriptor limit, or until the node dies from heap 
pressure.

Repair completed at 8:11pm, but the small repaired SSTables are not compacted, 
because compaction keeps picking the new unrepaired SSTables over them. The 
graph does show a downward trend, though, as compaction was slightly faster 
than insertion, so the repaired files would probably be compacted eventually.

During the unpatched test, only 2 pending compactions were reported despite 
~22k open file descriptors and ~10k SSTables. At 8:33pm I disabled the traffic 
completely to hurry this along.
SSTables in each level: [10347/4, 5, 0, 0, 0, 0, 0, 0, 0]

> LCS estimated compaction tasks does not take number of files into account
> -------------------------------------------------------------------------
>
>                 Key: CASSANDRA-13354
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13354
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Compaction
>         Environment: Cassandra 2.2.9
>            Reporter: Jan Karlsson
>            Assignee: Jan Karlsson
>         Attachments: 13354-trunk.txt, patchedTest.png, unpatchedTest.png
>
>
> In LCS, the way we estimate the number of compaction tasks remaining for L0 
> is by taking the size of an SSTable and multiplying it by four. This gives 
> 4*160mb with default settings. This calculation is used to determine whether 
> repaired or unrepaired data should be compacted.
> Now this works well until you take repair into account. Repair streams over 
> many, many SSTables, which can be smaller than the configured SSTable size 
> depending on your use case. In our case we are talking about many thousands 
> of tiny SSTables. As the number of files increases, one can run into any 
> number of problems, including GC issues, too many open files, or a plain 
> increase in read latency.
> With the current algorithm we will choose repaired or unrepaired depending 
> on whichever side has more data in it, even if the repaired files outnumber 
> the unrepaired files by a large margin.
> Similarly, the algorithm that selects compaction candidates takes up to 32 
> SSTables at a time in L0, but the estimated task calculation does not take 
> this number into account. These two mechanisms should be aligned with each 
> other.
> I propose that we take the number of files in L0 into account when estimating 
> remaining tasks. 
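To make the proposal concrete, here is a rough sketch of the current bytes-only estimate versus one that also accounts for the L0 file count. The class and method names are mine for illustration, not Cassandra's actual code; the only inputs taken from the ticket are the 160mb default SSTable size, the factor of four, and the 32-SSTable L0 compaction limit.

```java
// Hypothetical sketch, not Cassandra's real implementation.
public class L0TaskEstimate {
    static final int MAX_COMPACTING_L0 = 32;          // SSTables taken per L0 compaction
    static final long MAX_SSTABLE_SIZE = 160L << 20;  // 160mb default SSTable size

    // Current behavior described in the ticket: estimate tasks from bytes only,
    // assuming each task processes 4 * sstable_size worth of data.
    static long byBytes(long totalL0Bytes) {
        return totalL0Bytes / (4 * MAX_SSTABLE_SIZE);
    }

    // Proposed direction: also require enough tasks to drain the file backlog,
    // since each L0 compaction takes at most 32 SSTables regardless of size.
    static long withFileCount(long totalL0Bytes, long l0FileCount) {
        long byFiles = (l0FileCount + MAX_COMPACTING_L0 - 1) / MAX_COMPACTING_L0;
        return Math.max(byBytes(totalL0Bytes), byFiles);
    }

    public static void main(String[] args) {
        // 10,347 tiny SSTables of ~1mb each, roughly the L0 state from the test above.
        long bytes = 10_347L << 20;
        System.out.println(byBytes(bytes));               // prints 16
        System.out.println(withFileCount(bytes, 10_347)); // prints 324
    }
}
```

With thousands of tiny SSTables, the bytes-only estimate reports almost no pending work (16 tasks), while the file-count-aware estimate correctly reflects the backlog (324 tasks), so the repaired side would no longer be starved.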



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
