Differentiate manual repair sessions from automatic
---------------------------------------------------
Key: CASSANDRA-1190
URL: https://issues.apache.org/jira/browse/CASSANDRA-1190
Project: Cassandra
Issue Type: Bug
Reporter: Stu Hood
Priority: Critical
Fix For: 0.6.3, 0.7
Currently both manual and automatic repair sessions use the same timeout value:
TREE_STORE_TIMEOUT. This effectively caps how long compaction can take before a
manual repair fails.
For automatic/natural repairs (triggered by two nodes autonomously finishing
major compactions around the same time), you want a relatively low
TREE_STORE_TIMEOUT value, because trees generated a long time apart will cause
a lot of unnecessary repair. The current value is 10 minutes, to optimize for
this case.
On the other hand, for manual repairs, TREE_STORE_TIMEOUT needs to be
significantly higher. For instance, if a manual repair is triggered for a
source node A storing 2 TB of data and a destination node B with an empty
store, then node B must wait long enough for node A to finish compacting 2 TB
of data, which might take more than 12 hours. If node B times out its local
tree before node A sends its tree, the repair will not occur.
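One possible shape for the fix, as a rough sketch only: pick the tree expiry
timeout based on how the repair session was started. Every name below
(RepairTimeouts, SessionType, timeoutFor, isExpired) is hypothetical and not
taken from the Cassandra source. The 24 hour manual timeout is an assumed
value, chosen only so that it exceeds the 12 hour compaction example above.
The 10 minute automatic timeout is the current value cited in this issue.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: names are illustrative, not Cassandra's actual API.
public class RepairTimeouts {
    public enum SessionType { AUTOMATIC, MANUAL }

    // Current value cited in this issue: 10 minutes.
    public static final long AUTOMATIC_TIMEOUT_MS = TimeUnit.MINUTES.toMillis(10);
    // Assumed value: long enough to outlast a 12+ hour compaction.
    public static final long MANUAL_TIMEOUT_MS = TimeUnit.HOURS.toMillis(24);

    // Select the timeout according to how the session was triggered.
    public static long timeoutFor(SessionType type) {
        return type == SessionType.MANUAL ? MANUAL_TIMEOUT_MS : AUTOMATIC_TIMEOUT_MS;
    }

    // A stored tree is expired once it is older than its session's timeout.
    public static boolean isExpired(SessionType type, long storedAtMs, long nowMs) {
        return nowMs - storedAtMs > timeoutFor(type);
    }

    public static void main(String[] args) {
        long storedAt = 0L;
        long twelveHoursLater = TimeUnit.HOURS.toMillis(12);
        // Node B's tree after waiting 12 hours for node A to compact:
        System.out.println(isExpired(SessionType.AUTOMATIC, storedAt, twelveHoursLater)); // true
        System.out.println(isExpired(SessionType.MANUAL, storedAt, twelveHoursLater));    // false
    }
}
```

With a single shared timeout, the first case is what happens to manual repairs
today: node B discards its tree long before node A finishes compacting.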