Re: [UPDATE] CEP-37

Paulo Motta Wed, 23 Apr 2025 13:52:43 -0700

The long-awaited managed repair feature is finally happening; this is
awesome! :)

Congrats all for this achievement!

On Wed, Apr 23, 2025 at 4:27 PM Jordan West <[email protected]> wrote:

> Great work all! Another awesome milestone and huge step forward for the
> project!
>
> On Wed, Apr 23, 2025 at 12:47 Jaydeep Chovatia <[email protected]>
> wrote:
>
>> The CEP-37 work has been successfully merged into the trunk today! Please
>> let me know if you have any issues.
>>
>> This merge is a massive win for Apache Cassandra — a significant step
>> forward. But we're not stopping here. There's more to come, and we are
>> committed to pushing repair automation even further and closing the gaps in
>> the remaining flows. A few examples:
>>
>>    1. Automatically running repair as part of the node replacement:
>>    Design
>>    <https://docs.google.com/document/d/1SZIQPbIWNDsbWnIk5N5tyQCQzJ4ypwuhH-t5dO5WeZs/edit?tab=t.0>
>>    & POC <https://github.com/jaydeepkumar1984/cassandra/pull/54> are
>>    already out [CASSANDRA-20281
>>    <https://issues.apache.org/jira/browse/CASSANDRA-20281>]
>>    2. Stopping repair automatically between Cassandra major version
>>    upgrades [CASSANDRA-20048
>>    <https://issues.apache.org/jira/browse/CASSANDRA-20048>]
>>    3. Repairing automatically when Keyspace replication changes [
>>    CASSANDRA-20582
>>    <https://issues.apache.org/jira/browse/CASSANDRA-20582>]
>>
>> Thanks for all the help and support from the Apache Cassandra community!
>>
>> Yours sincerely,
>> Andy Tolbert, Chris Lohfink, Francisco Guerrero, Kristijonas Zalys, and
>> Jaydeep
>>
>> On Sun, Mar 9, 2025 at 8:53 PM Jaydeep Chovatia <
>> [email protected]> wrote:
>>
>>> Thanks a lot, Jon!
>>> This has truly been a team effort, with Andy Tolbert, Chris Lohfink,
>>> Francisco Guerrero, and Kristijonas Zalys all contributing over the past
>>> year. The credit belongs to everyone!
>>>
>>> Jaydeep
>>>
>>> On Sun, Mar 9, 2025 at 2:35 PM Jon Haddad <[email protected]>
>>> wrote:
>>>
>>>> This is all really exciting.  Getting a built-in, orchestrated repair
>>>> is a massive achievement.  Thank you for your work on this; it's
>>>> incredibly valuable to the community!!
>>>>
>>>> Jon
>>>>
>>>> On Sun, Mar 9, 2025 at 2:25 PM Jaydeep Chovatia <
>>>> [email protected]> wrote:
>>>>
>>>>> No problem, Dave! Thank you.
>>>>>
>>>>> Jaydeep
>>>>>
>>>>> On Sun, Mar 9, 2025 at 10:46 AM Dave Herrington <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Jaydeep,
>>>>>>
>>>>>> Thank you for taking time to answer my questions and for the links to
>>>>>> the design and overview docs, which are excellent and answer all of my
>>>>>> remaining questions.  Sorry I missed those links in the CEP page.
>>>>>>
>>>>>> Great work and I will continue to follow your progress on this
>>>>>> powerful new feature.
>>>>>>
>>>>>> Thanks!
>>>>>> -Dave
>>>>>>
>>>>>> On Sat, Mar 8, 2025 at 9:36 AM Jaydeep Chovatia <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> Hi David,
>>>>>>>
>>>>>>> Thanks for the kind words!
>>>>>>>
>>>>>>> >Is there a goal in this CEP to make automated repair work during
>>>>>>> rolling upgrades, when multiple versions exist in the cluster?
>>>>>>> We debated this a lot on ASF Slack
>>>>>>> (#cassandra-repair-scheduling-cep37). The summary is that, ideally, we
>>>>>>> want repair to function while the cluster is running mixed versions, but
>>>>>>> the reality is that there is currently no test suite inside Apache
>>>>>>> Cassandra to verify the streaming behavior in a mixed-version cluster,
>>>>>>> so the confidence is low.
>>>>>>> We agreed on the following: 1) Keeping safety in mind, disable repair by
>>>>>>> default during mixed version; 2) Add a comprehensive test suite;
>>>>>>> 3) Allow repair during mixed version. Currently, we are at #1.
>>>>>>>
>>>>>>> >Would automated repair be smart enough to automatically stop, if it
>>>>>>> sees incompatible versions?
>>>>>>> That's the plan, and we already have a PR (CASSANDRA-20048
>>>>>>> <https://issues.apache.org/jira/browse/CASSANDRA-20048>) out from
>>>>>>> Chris Lohfink. The thing we are debating is whether to stop only on a
>>>>>>> major version mismatch or also on a minor version mismatch, and we are
>>>>>>> leaning towards disabling only for a major version mismatch.
>>>>>>> Regardless, this should be available soon.
>>>>>>> We are also extending this further, as per feedback from David
>>>>>>> Capwell, so that repair stops automatically if we detect a new DC or a
>>>>>>> change in a keyspace's RF. That will be covered later as part of
>>>>>>> CASSANDRA-20414
>>>>>>> <https://issues.apache.org/jira/browse/CASSANDRA-20414>
>>>>>>>
>>>>>>> >If automated repair must be disabled for the entire cluster, will
>>>>>>> this be a single nodetool command, or must automated repair be
>>>>>>> disabled on each node individually?
>>>>>>> Yes, it is a nodetool command and does not require any restarts! All
>>>>>>> the *nodetool* command details are currently covered in the design
>>>>>>> doc
>>>>>>> <https://docs.google.com/document/d/1CJWxjEi-mBABPMZ3VWJ9w5KavWfJETAGxfUpsViPcPo/edit?tab=t.0#heading=h.89fmsespiosd>,
>>>>>>> and the same details will also be available in the Cassandra
>>>>>>> overview.adoc
>>>>>>> <https://github.com/apache/cassandra/pull/3598/files?short_path=e901018#diff-e90101885c1188844bb4188d1301277bfdc4a9e1e705c4ab8a6cc5a4b44460c0>
>>>>>>> .
>>>>>>>
>>>>>>> >Would it make sense for automated repair to upgrade sstables, if it
>>>>>>> finds old formats? (Maybe this could be a feature that could be
>>>>>>> optionally enabled?)
>>>>>>> My opinion is that it should not be part of the repair. It is best
>>>>>>> suited as part of the Cassandra upgrade framework; I guess Paulo M is
>>>>>>> looking at it.
>>>>>>>
>>>>>>> >W.R.T. the repair logging tables in the system_distributed
>>>>>>> keyspace, will these tables have a configurable TTL, or must they be
>>>>>>> periodically truncated to limit their size?
>>>>>>> The number of entries will equal the number of Cassandra nodes in a
>>>>>>> cluster. There is no TTL because each row represents the repair status
>>>>>>> of that particular node. The entries would be automatically
>>>>>>> added/removed as nodes are added/removed from the Cassandra cluster.
>>>>>>>
>>>>>>> Jaydeep
>>>>>>>
>>>>>>> On Sat, Mar 8, 2025 at 7:46 AM Dave Herrington <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> Jaydeep,
>>>>>>>>
>>>>>>>> Thank you for your excellent efforts on this mission-critical
>>>>>>>> feature.  The stated goals of CEP-37 are noble and stand to make
>>>>>>>> valuable improvements for cluster operations.  I look forward to
>>>>>>>> testing these new capabilities.
>>>>>>>>
>>>>>>>> My apologies up-front if you’ve already answered these questions.
>>>>>>>> I did read the CEP a number of times and the linked JIRAs, but these
>>>>>>>> are my questions that I couldn’t answer myself.
>>>>>>>>
>>>>>>>> I’m interested to understand the goals of CEP-37 W.R.T. rolling
>>>>>>>> upgrades of large clusters, as I am responsible for maintaining the
>>>>>>>> cluster operations runbooks for a number of customers.
>>>>>>>>
>>>>>>>> Operators have to navigate the upgrade gauntlet with automated
>>>>>>>> repairs disabled and get all nodes upgraded within gc_grace_seconds and
>>>>>>>> then do a full repair, before restarting automated repairs.
>>>>>>>>
>>>>>>>> I see that CASSANDRA-7530
>>>>>>>> https://issues.apache.org/jira/browse/CASSANDRA-7530 is related to
>>>>>>>> this.
>>>>>>>>
>>>>>>>> Is there a goal in this CEP to make automated repair work during
>>>>>>>> rolling upgrades, when multiple versions exist in the cluster?
>>>>>>>>
>>>>>>>> (I think this would imply that stopping automated repairs would no
>>>>>>>> longer be a pre-upgrade step.)
>>>>>>>>
>>>>>>>> Would automated repair be smart enough to automatically stop, if it
>>>>>>>> sees incompatible versions?
>>>>>>>>
>>>>>>>> Would automated repair continue between nodes with compatible
>>>>>>>> versions, or would it stop for the entire cluster?
>>>>>>>>
>>>>>>>> If automated repair must be disabled for the entire cluster, will
>>>>>>>> this be a single nodetool command, or must automated repair be
>>>>>>>> disabled on each node individually?
>>>>>>>>
>>>>>>>> Would it make sense for automated repair to upgrade sstables, if it
>>>>>>>> finds old formats? (Maybe this could be a feature that could be
>>>>>>>> optionally enabled?)
>>>>>>>>
>>>>>>>> W.R.T. the repair logging tables in the system_distributed
>>>>>>>> keyspace, will these tables have a configurable TTL, or must they be
>>>>>>>> periodically truncated to limit their size?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> -Dave
>>>>>>>>
>>>>>>>> David A. Herrington II
>>>>>>>> President and Chief Engineer
>>>>>>>> RhinoSource, Inc.
>>>>>>>>
>>>>>>>> *Data Lake Architecture, Cloud Computing and Advanced Analytics.*
>>>>>>>>
>>>>>>>> www.rhinosource.com
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Mar 7, 2025 at 11:48 AM Jaydeep Chovatia <
>>>>>>>> [email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hello Everyone,
>>>>>>>>>
>>>>>>>>> I wanted to update you on CEP-37
>>>>>>>>> <https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-37+Apache+Cassandra+Unified+Repair+Solution>
>>>>>>>>> (Jira: CASSANDRA-19918
>>>>>>>>> <https://issues.apache.org/jira/browse/CASSANDRA-19918>) work.
>>>>>>>>> Over the last year, some of us (Andy Tolbert, Chris Lohfink,
>>>>>>>>> Francisco Guerrero, and Kristijonas Zalys) have been working closely
>>>>>>>>> on making CEP-37 rock solid, with support from Josh McKenzie, Dinesh
>>>>>>>>> Joshi, and David Capwell.
>>>>>>>>> First and foremost, a huge thank you to everyone, including the
>>>>>>>>> broader Apache Cassandra community, for their invaluable
>>>>>>>>> contributions in making CEP-37 robust and solid!
>>>>>>>>>
>>>>>>>>> Here is the current status:
>>>>>>>>>
>>>>>>>>> *Feature stability*
>>>>>>>>>
>>>>>>>>>    - *Voted feature:* All the features mentioned in CEP-37 have
>>>>>>>>>    worked as expected.
>>>>>>>>>    - *Post-voted feature:* A few new minor improvements
>>>>>>>>>    <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=272927365#CEP37ApacheCassandraUnifiedRepairSolution-Post-VoteUpdates>
>>>>>>>>>    have been added post-vote, and they are also working as expected.
>>>>>>>>>    - The functionality has been tested by multiple people over a
>>>>>>>>>    period of time.
>>>>>>>>>    - Some other facts: it has already been validated at scale
>>>>>>>>>    <https://www.youtube.com/watch?v=xFicEj6Nhq8>. Another big
>>>>>>>>>    Cassandra use case is in the process of validating/adopting it in
>>>>>>>>>    their environment.
>>>>>>>>>
>>>>>>>>> *Source Code*
>>>>>>>>>
>>>>>>>>>    - It is an opt-in feature; nobody notices anything unless
>>>>>>>>>    someone opts in.
>>>>>>>>>    - By default, this feature is pretty isolated (in a separate
>>>>>>>>>    package) from the source code point of view (94% of the source code
>>>>>>>>>    lines are in the new files)
>>>>>>>>>    - Thorough documentation has been added:
>>>>>>>>>       - overview.doc
>>>>>>>>>       - metrics.doc
>>>>>>>>>       - cassandra.yaml doc
>>>>>>>>>       - NEWS.txt overview
>>>>>>>>>    - Five people (Andy Tolbert, Chris Lohfink, Francisco
>>>>>>>>>    Guerrero, Kristijonas Zalys, and Jaydeep Chovatia) have contributed.
>>>>>>>>>    - The source code has been reviewed multiple times by the same
>>>>>>>>>    five people.
>>>>>>>>>
>>>>>>>>> *Test Coverage*
>>>>>>>>>
>>>>>>>>>    - Comprehensive test coverage has been added to cover all
>>>>>>>>>    aspects.
>>>>>>>>>    - The entire test suite has been passing.
>>>>>>>>>
>>>>>>>>> We are in the final review phase and nearly ready to merge. If
>>>>>>>>> anyone has any last-minute feedback, this is the final opportunity for
>>>>>>>>> review.
>>>>>>>>>
>>>>>>>>> Thank you!
>>>>>>>>> Andy Tolbert, Chris Lohfink, Francisco Guerrero, Kristijonas
>>>>>>>>> Zalys, and Jaydeep
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>> --
>>>>>> -Dave
>>>>>>
>>>>>> David A. Herrington II
>>>>>> President and Chief Engineer
>>>>>> RhinoSource, Inc.
>>>>>>
>>>>>> *Data Lake Architecture, Cloud Computing and Advanced Analytics.*
>>>>>>
>>>>>> www.rhinosource.com
>>>>>>
>>>>>
