Paulo Motta created CASSANDRA-21115:
---------------------------------------

             Summary: Auto-repair skips incomplete first repair after node 
restart due to ordering of checks
                 Key: CASSANDRA-21115
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-21115
             Project: Apache Cassandra
          Issue Type: Bug
            Reporter: Paulo Motta


When a node starts its very first auto-repair and crashes before completing it, 
the repair is not resumed after restart. Instead, the "too soon to repair" check 
skips it for up to 24 hours.

*What happens*

  1. Node joins the cluster, no repair history exists yet
  2. insertNewRepairHistory() creates a record with both repair_start_ts and 
repair_finish_ts set to the current time (let's call it T1)
  3. When repair actually starts, only repair_start_ts gets updated to T2
  4. Node crashes mid-repair
  5. On restart, tooSoonToRunRepair() is called before myTurnToRunRepair()
  6. tooSoonToRunRepair() reads repair_finish_ts, which is still T1 (the record 
creation time, not an actual repair completion)
  7. If less than 24h have passed since T1, the check returns "too soon" and 
bails out
  8. The logic in myTurnToRunRepair() that detects ongoing repairs 
(repair_start_ts > repair_finish_ts) never gets a chance to run
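
The sequence above can be modelled with a small standalone sketch. This is not the Cassandra source: the RepairHistory class, the decideAfterRestart() helper, the returned strings and the hard-coded 24h interval are assumptions made for illustration. Only the order of the two checks and the repair_start_ts > repair_finish_ts condition come from the description.

```java
import java.time.Duration;

// Simplified model of the check ordering described above; not the real Cassandra classes.
class AutoRepairOrderingSketch {

    // Hypothetical stand-in for this node's repair history record.
    static final class RepairHistory {
        long repairStartTs;
        long repairFinishTs;

        RepairHistory(long repairStartTs, long repairFinishTs) {
            this.repairStartTs = repairStartTs;
            this.repairFinishTs = repairFinishTs;
        }
    }

    // Assumed value of min_repair_interval for this sketch.
    static final long MIN_REPAIR_INTERVAL_MS = Duration.ofHours(24).toMillis();

    // Only looks at repair_finish_ts, so it cannot tell a real completion
    // from the placeholder written when the record was created.
    static boolean tooSoonToRunRepair(RepairHistory h, long nowMs) {
        return nowMs - h.repairFinishTs < MIN_REPAIR_INTERVAL_MS;
    }

    // The "ongoing repair" condition from step 8.
    static boolean repairInProgress(RepairHistory h) {
        return h.repairStartTs > h.repairFinishTs;
    }

    // Interval check first, ongoing-repair detection second (the reported ordering).
    static String decideAfterRestart(RepairHistory h, long nowMs) {
        if (tooSoonToRunRepair(h, nowMs)) {
            return "SKIPPED_TOO_SOON";
        }
        return repairInProgress(h) ? "RESUME" : "START_NEW";
    }

    public static void main(String[] args) {
        long t1 = 0L;                                     // record created, both timestamps = T1
        long t2 = t1 + Duration.ofMinutes(5).toMillis();  // repair actually starts
        RepairHistory h = new RepairHistory(t1, t1);
        h.repairStartTs = t2;                             // only the start timestamp moves to T2

        long restart = t2 + Duration.ofHours(1).toMillis(); // crash, then restart one hour later
        System.out.println(decideAfterRestart(h, restart)); // prints SKIPPED_TOO_SOON
    }
}
```

In this model the record clearly shows an unfinished repair (start is later than finish), but the interval check returns first, so that condition is never evaluated until 24 hours after T1.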

*Expected behavior*

  A repair that was in progress should be resumed after restart, regardless of 
the min_repair_interval setting. The "too soon" check should not apply to 
incomplete repairs.
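
One way to get this behavior, sketched under assumptions, is to test for an incomplete repair before applying the interval check. The class and method names below (AutoRepairResumeSketch, isIncomplete, shouldRunRepair) and the fixed 24h interval are hypothetical and are not taken from the Cassandra code base; the actual fix may be structured differently.

```java
// Illustrative ordering only: incomplete repairs bypass the interval check.
class AutoRepairResumeSketch {

    // Assumed value of min_repair_interval for this sketch (24 hours).
    static final long MIN_REPAIR_INTERVAL_MS = 24L * 60 * 60 * 1000;

    // A repair that started after the last recorded finish never completed.
    static boolean isIncomplete(long repairStartTs, long repairFinishTs) {
        return repairStartTs > repairFinishTs;
    }

    // Decide whether the node should run (or resume) a repair at time nowMs.
    static boolean shouldRunRepair(long repairStartTs, long repairFinishTs, long nowMs) {
        if (isIncomplete(repairStartTs, repairFinishTs)) {
            return true; // resume regardless of the interval
        }
        return nowMs - repairFinishTs >= MIN_REPAIR_INTERVAL_MS;
    }

    public static void main(String[] args) {
        long t1 = 0L;            // record creation time
        long t2 = 300_000L;      // repair start, five minutes later
        long restart = 3_900_000L; // restart one hour after T2

        System.out.println(shouldRunRepair(t2, t1, restart)); // prints true
    }
}
```

With this ordering the interrupted first repair from the scenario above is picked up again on restart, while a record whose repair completed less than 24 hours ago is still subject to the interval.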



*How to reproduce*

  1. Set up a fresh node with auto-repair enabled
  2. Wait for the first repair to start
  3. Kill the node before repair completes
  4. Restart the node within 24 hours
  5. Observe that repair is skipped with "Too soon to run repair" in the logs



