Robert Coli created CASSANDRA-8703:
--------------------------------------
Summary: incremental repair vs. bitrot
Key: CASSANDRA-8703
URL: https://issues.apache.org/jira/browse/CASSANDRA-8703
Project: Cassandra
Issue Type: Bug
Reporter: Robert Coli
Incremental repair is a great improvement in Cassandra, but it lacks a feature
that non-incremental repair has: protection against bitrot.
Scenario:
1) repair SSTable, marking it repaired
2) cosmic ray hits hard drive, corrupting a record in SSTable
3) range is now effectively unrepaired, but the SSTable is still marked
repaired, so Cassandra thinks the range is repaired
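
To make the scenario concrete, here is a toy Python sketch. It is not Cassandra
code: ToySSTable, incremental_repair_candidates and the use of zlib.crc32 are
all hypothetical stand-ins, assumed only for illustration. It just shows that a
bit flip after repair leaves the "repaired" flag set, so a repair that only
looks at unrepaired SSTables never revisits the corrupted data.

```python
import zlib
from dataclasses import dataclass


@dataclass
class ToySSTable:
    """Hypothetical stand-in for an SSTable: raw bytes plus a repaired flag."""
    data: bytearray
    repaired: bool = False


def incremental_repair_candidates(sstables):
    """Assumed model: incremental repair only considers unrepaired SSTables."""
    return [s for s in sstables if not s.repaired]


# Step 1: repair the SSTable and mark it repaired.
sstable = ToySSTable(bytearray(b"key1=value1;key2=value2"))
checksum_at_repair = zlib.crc32(sstable.data)
sstable.repaired = True

# Step 2: a single bit flips on disk.
sstable.data[5] ^= 0x01

# Step 3: the data no longer matches what was repaired, but the flag is
# unchanged, so the next incremental repair skips this SSTable.
corrupted = zlib.crc32(sstable.data) != checksum_at_repair
skipped = len(incremental_repair_candidates([sstable])) == 0
print(f"corrupted={corrupted} still_marked_repaired={sstable.repaired} skipped={skipped}")
```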
From my understanding, if bitrot is detected (e.g. via the CRC on the read path),
then all SSTables containing the corrupted range need to be marked unrepaired
on all replicas. Per marcuse@IRC, the naive/simplest response would be to just
trigger a full repair in this case.
I am concerned about incremental repair as an operational default while it does
not handle this case. As an aside, this would also seem to require a new CRC on
the uncompressed read path, as otherwise one cannot detect the corruption
without periodic checksumming of SSTables. Alternatively, a "nodetool checksum"
function that verifies table checksums, marks ranges unrepaired on failure,
and can be run every gc_grace_seconds would seem to meet the requirement.
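
A rough sketch of what such a checksum pass might do, again as a toy Python
model rather than anything from the Cassandra codebase. "nodetool checksum" is
only the command proposed above; ToySSTable, verify_and_unmark, the stored_crc
field and the token ranges are hypothetical names assumed for illustration. The
sketch only handles the local node; propagating the unrepaired state to other
replicas is left out.

```python
import zlib
from dataclasses import dataclass


@dataclass
class ToySSTable:
    """Hypothetical SSTable model with a checksum recorded at write time."""
    name: str
    token_range: tuple
    data: bytes
    stored_crc: int
    repaired: bool = True


def verify_and_unmark(sstables):
    """Recompute each checksum; on mismatch, mark the SSTable unrepaired.

    Returns the token ranges that failed verification so a later repair
    could target them.
    """
    failed = []
    for s in sstables:
        if zlib.crc32(s.data) != s.stored_crc:
            s.repaired = False
            failed.append(s.token_range)
    return failed


intact = b"row-a"
original = b"row-b"
tables = [
    ToySSTable("sst-1", (0, 100), intact, zlib.crc32(intact)),
    # Data differs from what the stored checksum was computed over.
    ToySSTable("sst-2", (100, 200), b"row-B", zlib.crc32(original)),
]

failed_ranges = verify_and_unmark(tables)
print(failed_ranges, [(t.name, t.repaired) for t in tables])
```

Run periodically (for example once per gc_grace_seconds, as suggested above),
a pass like this would feed the failed ranges back into the next repair.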