John Sumsion created CASSANDRA-8169:
---------------------------------------
Summary: Background bitrot detector to avoid client exposure
Key: CASSANDRA-8169
URL: https://issues.apache.org/jira/browse/CASSANDRA-8169
Project: Cassandra
Issue Type: New Feature
Reporter: John Sumsion

With a lot of static data sitting in SSTables, and with only a relatively small
add/edit rate, incremental repair sounds very good. However, there is one
significant cost to switching away from full repair.

If/when bitrot corrupts an SSTable, there is nothing standing between a user
query and a corruption/failure-response event except for the other replicas.
Combined with a rolling restart or upgrade, this can make a token range
non-writable via quorum CL.

While you could argue that full repairs should be scheduled on a longer-term
regular basis, I don't need all of the repair overhead. I just want
something that runs ahead of user queries and whose only responsibility is to
detect bitrot, so that I can replace nodes aggressively instead of
handling it as a failure-response situation.

This bitrot detector need not incur the full cross-cluster cost of repair, and
so would be less of a burden to run periodically.
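
To make the idea concrete, here is a minimal sketch of a node-local scan that
touches no other replica. It is only an illustration in Python, not Cassandra
code: the manifest of known-good checksums, the function names, and the use of
CRC32 over whole files are all assumptions made for this example, and it does
not read or rely on Cassandra's actual on-disk format.

```python
import os
import zlib


def file_crc32(path, chunk_size=1 << 20):
    """Stream a file through CRC32 so large files never sit fully in memory."""
    crc = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def build_manifest(data_dir):
    """Record a checksum per file (hypothetical: taken when the file is written)."""
    manifest = {}
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            path = os.path.join(root, name)
            manifest[os.path.relpath(path, data_dir)] = file_crc32(path)
    return manifest


def find_bitrot(data_dir, manifest):
    """Return relative paths that are missing or no longer match the manifest."""
    damaged = []
    for rel_path, expected in manifest.items():
        path = os.path.join(data_dir, rel_path)
        if not os.path.isfile(path) or file_crc32(path) != expected:
            damaged.append(rel_path)
    return sorted(damaged)
```

Run periodically (and throttled) in the background, a non-empty result from
find_bitrot would be the signal to replace the node before a user query hits
the damaged file.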