[ 
https://issues.apache.org/jira/browse/CASSANDRA-21138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18055565#comment-18055565
 ] 

Paulo Motta commented on CASSANDRA-21138:
-----------------------------------------

Since cluster metadata in 5.0 is more similar to 4.1 than to trunk (which has 
TCM), I decided to forward-port this from Jaydeep's 4.1 backport 
([https://github.com/jaydeepkumar1984/cassandra/tree/auto_repair_v2_on_4_1/]), 
adapting the implementation to 5.0 where needed. For example, the 
RepairRunnable class has been renamed to RepairCoordinator.

Another benefit of forward porting from Jaydeep's branch is that it was easier 
to include other autorepair improvements that were present in the branch:
 - Improved observability in AutoRepair (CASSANDRA-20581)
 - Stop repair scheduler if two major versions detected (CASSANDRA-20048)
 - Safeguard Full repair against disk protection (CASSANDRA-20045)
 - Stop AutoRepair monitoring thread upon shutdown (CASSANDRA-20623)
 - Fix race condition in auto-repair scheduler (CASSANDRA-20265)
 - Minimum repair task duration setting (CASSANDRA-20160)
 - Preview_repaired auto-repair type (CASSANDRA-20046)

Because CASSANDRA-17056 is present in 5.0, the token range size estimation is 
simpler than in 4.1: the token range splitters use the SSTable API directly 
(onDiskSizeForPartitionPositions() and getPositionsForRanges()) instead of 
manipulating SSTableScanner, so this part matches trunk's implementation. 
Other than this, there were no notable differences in the feature port.

There are also the following changes in comparison to the trunk/4.1 
implementation:
a) Disable repair scheduling when in mixed mode with versions equal to or 
lower than 5.0.6 (assuming this will be added to 5.0.7) 
([commit|https://github.com/apache/cassandra/pull/4558/changes/ea9cdab0396fac06f2f7c7c51cc808864e0e2a6a])
b) Gate schema changes and autorepair module initialization on the JVM 
property "cassandra.autorepair.enable" (disabled by default) to reduce upgrade 
risk for users who do not intend to enable this feature. 
([commit|https://github.com/apache/cassandra/pull/4558/changes/4ebf8919397882063bd58cc4d51d01657158b3b3])
c) Disable repair_disk_headroom_reject_ratio by default to avoid introducing 
breaking changes 
([commit|https://github.com/apache/cassandra/pull/4558/changes/88214fb201d57139926b354c5adb67847fd639a9])
d) Because the feature flag is disabled by default, the autorepair tests in 
AutoRepairTablePropertyTest and DescribeSchemaTest stopped working: the 
statically initialized schema no longer includes the auto_repair schema 
column. Since these tests checked for the column with autorepair both enabled 
and disabled, I ported them to the AutoRepairTablePropertyDTest and 
AutoRepairFlagToggleTest dtests, where it's possible to set the JVM property 
before the schema is statically initialized.
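
To illustrate how items a) and b) combine, here is a minimal sketch of a 
scheduling gate. It is not the code from the patch. Only the property name 
"cassandra.autorepair.enable" and the 5.0.6 threshold come from the description 
above. The class name, method names, and version parsing are hypothetical and 
exist only for this example.

```java
import java.util.List;

// Hypothetical illustration only; not taken from the Cassandra code base.
final class AutoRepairGateSketch {
    // Property name as given in item b).
    static final String ENABLE_PROPERTY = "cassandra.autorepair.enable";

    // Highest release treated as unable to take part in auto-repair (item a).
    static final int[] LAST_UNSUPPORTED = {5, 0, 6};

    /** True only when the JVM runs with -Dcassandra.autorepair.enable=true. */
    static boolean featureEnabled() {
        // Boolean.getBoolean returns false when the property is unset.
        return Boolean.getBoolean(ENABLE_PROPERTY);
    }

    /** Parses "major.minor.patch", ignoring a suffix such as "-SNAPSHOT". */
    static int[] parse(String version) {
        String[] parts = version.split("-", 2)[0].split("\\.");
        int[] out = new int[3];
        for (int i = 0; i < 3 && i < parts.length; i++) {
            out[i] = Integer.parseInt(parts[i]);
        }
        return out;
    }

    /** Compares two parsed versions component by component. */
    static int compare(int[] a, int[] b) {
        for (int i = 0; i < 3; i++) {
            if (a[i] != b[i]) {
                return Integer.compare(a[i], b[i]);
            }
        }
        return 0;
    }

    /** Allows scheduling only if the flag is on and every node is newer than 5.0.6. */
    static boolean mayScheduleRepairs(List<String> nodeVersions) {
        if (!featureEnabled()) {
            return false;
        }
        for (String version : nodeVersions) {
            if (compare(parse(version), LAST_UNSUPPORTED) <= 0) {
                return false; // mixed mode with an older node: do not schedule
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(mayScheduleRepairs(List.of("5.0.6", "5.0.7")));
    }
}
```

With this sketch, a cluster that still contains a 5.0.6 node never schedules 
repairs, and a node started without the property never schedules them either, 
regardless of the versions it sees.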

I added a new upgrade test, AutoRepairDisabledSchemaUpgradeTest, to check that 
there is no schema mismatch when a node upgrades from 5.0.6 to 5.0.7 with the 
feature flag disabled.

Note that the {{nodetool sstablerepairedset}} command, although unrelated to 
the autorepair feature per se, was included in the original patch. I included 
it here as well for completeness and because it's a low-risk change for 5.0.

[https://pre-ci.cassandra.apache.org/job/cassandra-5.0/71/]

> Backport Unified Repair Solution (CEP-37) to Cassandra 5.0
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-21138
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-21138
>             Project: Apache Cassandra
>          Issue Type: Improvement
>          Components: Consistency/Repair
>            Reporter: Paulo Motta
>            Assignee: Paulo Motta
>            Priority: Normal
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Backport Unified Repair Solution 
> ([CEP-37|https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-37%3A+Apache+Cassandra+Unified+Repair+Solution]
>  / CASSANDRA-19918) to Cassandra 5.0.
> This should be an optional feature enabled via feature flag.
> Mailing list discussions:
>  - [https://lists.apache.org/thread/tn0ov4d61n4r9wsxd9ob890oxb3joh41]
>  - [https://lists.apache.org/thread/snss0l76bxmg09vs1wdrcvt37nckh0vw]



