[ https://issues.apache.org/jira/browse/CASSANDRA-8312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14239488#comment-14239488 ]
Jimmy Mårdell commented on CASSANDRA-8312:
------------------------------------------

Were the references you refer to added in 2.1? I notice that 2.1 calls SSTableReader.releaseReferences(sstables); on the snapshot's sstables, which (before my patch) wasn't the case on 2.0.

If so, it should be enough to just remove the line sstable.acquireReference(); in getSnapshotSSTableReader on the 2.1/trunk branches.

> Use live sstables in snapshot repair if possible
> ------------------------------------------------
>
>                 Key: CASSANDRA-8312
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8312
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Jimmy Mårdell
>            Assignee: Jimmy Mårdell
>            Priority: Minor
>             Fix For: 2.0.12, 3.0, 2.1.3
>
>         Attachments: cassandra-2.0-8312-1.txt
>
>
> Snapshot repair can be much slower than parallel repair because of the
> overhead of opening the SSTables in the snapshot. This is particularly true
> when using LCS, as you typically have many smaller SSTables then.
> I compared parallel and sequential repair on a small range on one of our
> clusters (2*3 replicas). With parallel repair, this took 22 seconds. With
> sequential repair (default in 2.0), the same range took 330 seconds! This is
> an overhead of 330-22*6 = 198 seconds spent just opening SSTables (there were
> 1000+ sstables). Also, opening 1000 sstables for many smaller ranges surely
> causes lots of memory churn.
> The idea would be to list the sstables in the snapshot, but use the
> corresponding sstables in the live set if they are still available. For almost
> all sstables, the original one should still exist.
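
The idea in the issue description (list the sstables in the snapshot, but reuse the live one where it still exists) could be sketched roughly as below. This is only an illustration, not the attached patch: the class SnapshotRepairSketch, the Resolution type, the resolve method, and the use of plain names as keys are all hypothetical and are not Cassandra APIs. Reference counting (the acquireReference/releaseReferences question raised in the comment) is deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: none of these names exist in Cassandra.
final class SnapshotRepairSketch {

    /** The readers chosen for a snapshot, plus how many had to be opened from disk. */
    static final class Resolution<R> {
        final List<R> readers;
        final int opened;

        Resolution(List<R> readers, int opened) {
            this.readers = readers;
            this.opened = opened;
        }
    }

    /**
     * For each sstable name listed in the snapshot, reuse the already-open
     * live reader if one exists; otherwise fall back to opening the
     * snapshot copy (the expensive path the issue wants to avoid).
     */
    static <R> Resolution<R> resolve(List<String> snapshotNames,
                                     Map<String, R> liveByName,
                                     Function<String, R> openFromSnapshot) {
        List<R> readers = new ArrayList<>(snapshotNames.size());
        int opened = 0;
        for (String name : snapshotNames) {
            R live = liveByName.get(name);
            if (live != null) {
                readers.add(live);                          // cheap: already open
            } else {
                readers.add(openFromSnapshot.apply(name));  // e.g. compacted away since the snapshot
                opened++;
            }
        }
        return new Resolution<>(readers, opened);
    }
}
```

In this sketch the number of expensive opens is the count of snapshot sstables that are no longer in the live set, which the description expects to be small for most repairs.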