Jimmy Mårdell created CASSANDRA-8312:
----------------------------------------
Summary: Use live sstables in snapshot repair if possible
Key: CASSANDRA-8312
URL: https://issues.apache.org/jira/browse/CASSANDRA-8312
Project: Cassandra
Issue Type: Improvement
Reporter: Jimmy Mårdell
Priority: Minor
Snapshot repair can be much slower than parallel repair because of the
overhead of opening the SSTables in the snapshot. This is particularly true when
using LCS, as you typically have many smaller SSTables then.
I compared parallel and sequential repair on a small range on one of our
clusters (2*3 replicas). With parallel repair, this took 22 seconds. With
sequential repair (the default in 2.0), the same range took 330 seconds! That is
an overhead of 330 - 22*6 = 198 seconds spent just opening SSTables (there were
1000+ SSTables). Also, opening 1000 SSTables for many smaller ranges surely
causes a lot of memory churn.
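The overhead figure can be reproduced with a few lines of Java. This is only a
sketch of the arithmetic, assuming (as the formula above implies) that an
overhead-free sequential repair would cost one parallel-repair duration per
replica. The class and method names are made up for illustration.

```java
public class RepairOverhead {

    /**
     * Time spent by sequential repair beyond the assumed baseline of one
     * parallel-repair duration per replica.
     */
    static int overheadSeconds(int sequentialSeconds, int parallelSeconds, int replicas) {
        return sequentialSeconds - parallelSeconds * replicas;
    }

    public static void main(String[] args) {
        int replicas = 2 * 3; // "2*3 replicas" as reported above
        System.out.println(overheadSeconds(330, 22, replicas)); // prints 198
    }
}
```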
The idea would be to list the SSTables in the snapshot, but use the
corresponding SSTables in the live set if they are still available. For almost
all SSTables, the original one should still exist.
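A minimal sketch of that selection step, in Java, might look like the
following. It is not Cassandra code: OpenTable, select and the opener callback
are hypothetical stand-ins, and the SSTable names are invented. It assumes an
SSTable can be matched between the snapshot and the live set by name, which
relies on SSTables being immutable once written.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class SnapshotRepairSketch {

    /** Hypothetical stand-in for an opened SSTable; not Cassandra's SSTableReader. */
    static final class OpenTable {
        final String name;
        final boolean reusedFromLiveSet;

        OpenTable(String name, boolean reusedFromLiveSet) {
            this.name = name;
            this.reusedFromLiveSet = reusedFromLiveSet;
        }
    }

    /**
     * For each SSTable listed in the snapshot, reuse the already-open live
     * table with the same name if it still exists. Otherwise fall back to
     * opening the snapshot copy through the supplied (expensive) opener.
     */
    static List<OpenTable> select(List<String> snapshotNames,
                                  Map<String, OpenTable> liveByName,
                                  Function<String, OpenTable> openFromSnapshot) {
        List<OpenTable> result = new ArrayList<>();
        for (String name : snapshotNames) {
            OpenTable live = liveByName.get(name);
            result.add(live != null ? live : openFromSnapshot.apply(name));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, OpenTable> live = new HashMap<>();
        live.put("ks-cf-1", new OpenTable("ks-cf-1", true));
        live.put("ks-cf-3", new OpenTable("ks-cf-3", true));
        // "ks-cf-2" is in the snapshot but has since been compacted away.
        List<String> snapshot = Arrays.asList("ks-cf-1", "ks-cf-2", "ks-cf-3");

        for (OpenTable t : select(snapshot, live, n -> new OpenTable(n, false))) {
            System.out.println(t.name + " reused=" + t.reusedFromLiveSet);
        }
    }
}
```

Only the SSTables that have disappeared from the live set since the snapshot
was taken pay the cost of being opened again. A real implementation would also
have to keep the reused live SSTables from being deleted while the repair is
reading them.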