I agree we need to do a better job of wording this so people can
understand what is happening.

For your exact example, you are looking at too broad a scope.  The
requirement applies not at the full-cluster level but at the “token range”
level at which repair operates: a given token range needs to have a repair
start and complete within the gc_grace sliding window.  For your example of
a repair cycle that takes 5 days and is started every 7 days, assuming you
perform that cycle in the same order around the nodes every time, a given
node will have been repaired within 7 days, even though the time from the
start of repair 1 to the finish of repair 2 was more than 7 days.  The time
from the start of “token ranges repaired on day 0” to the finish of “token
ranges repaired on day 7” is less than the gc_grace window.
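
To make the difference between the two views concrete, here is a rough
sketch in Python using the numbers from the quoted example (gc_grace of 10
days, a tombstone written on day 1).  The function name and the half-day
per-range repair duration are assumptions made up for illustration; this is
not how Cassandra itself tracks repairs.

```python
def repaired_within_window(tombstone_day, gc_grace_days, repairs):
    """Return True if some (start, end) repair both starts and completes
    between the tombstone's creation and its expiry."""
    expiry = tombstone_day + gc_grace_days
    return any(start >= tombstone_day and end <= expiry
               for start, end in repairs)


GC_GRACE_DAYS = 10   # from the quoted example
TOMBSTONE_DAY = 1    # Tombstone A is written on day 1, expires on day 11

# Cluster-level view: whole repair cycles as (start_day, end_day).
full_cycles = [(0, 5), (7, 12)]

# Token-range view, same node order every cycle. Assumption: repairing the
# range that holds token A takes about half a day.
token_a_same_order = [(0.0, 0.5), (7.0, 7.5)]

# Token-range view when the order changes and token A is repaired last in
# the second cycle, as in the quoted timeline.
token_a_changed_order = [(0.0, 0.5), (11.5, 12.0)]

cluster_view = repaired_within_window(TOMBSTONE_DAY, GC_GRACE_DAYS, full_cycles)
same_order_view = repaired_within_window(TOMBSTONE_DAY, GC_GRACE_DAYS, token_a_same_order)
changed_order_view = repaired_within_window(TOMBSTONE_DAY, GC_GRACE_DAYS, token_a_changed_order)

print("cluster-level view:", cluster_view)
print("token range, same order:", same_order_view)
print("token range, changed order:", changed_order_view)
```

Judged at the cluster level, no full cycle fits inside the day 1 to day 11
window, yet the range holding token A is still repaired inside it as long
as the order stays the same.  Only the changed-order case misses the window.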

-Jeremiah Jordan

On May 16, 2025 at 2:03:00 PM, Mike Sun <m...@msun.io> wrote:

> The wording is subtle and can be confusing...
>
> It's important to distinguish between:
> 1. "You need to start and complete a repair within any gc_grace_seconds
> window"
> 2. "You need to start and complete a repair within gc_grace_seconds"
>
> #1 is a sliding time window: any interval between the time the tombstone
> is written (tombstone_created_time) and the time it expires
> (tombstone_created_time + gc_grace_seconds)
>
> #2 is a duration bound for the repair time
>
> My post is saying that to ensure the #1 requirement, you actually need to
> "start and complete two consecutive repairs within gc_grace_seconds"
>
>
> On Fri, May 16, 2025 at 2:49 PM Mike Sun <m...@msun.io> wrote:
>
>> > You need to *start and complete* a repair within any gc_grace_seconds
>> window.
>> Exactly this. And since "any gc_grace_seconds window" does not mean "any
>> gc_grace_seconds window from which a repair starts", the requirement needs
>> to be that the duration to "start and complete" two consecutive full
>> repairs is within gc_grace_seconds. That will ensure a repair "starts and
>> completes" within "any gc_grace_seconds" window.
>>
>>
>>
>> On Fri, May 16, 2025 at 2:43 PM Mick Semb Wever <m...@apache.org> wrote:
>>
>>>
>>>> e.g., assume gc_grace_seconds=10 days, a repair takes 5 days to run
>>>> * Day 0: Repair 1 starts and processes token A
>>>> * Day 1: Token A is deleted resulting in Tombstone A that will expire
>>>> on Day 11
>>>> * Day 5: Repair 1 completes
>>>> * Day 7: Repair 2 starts
>>>> * Day 11: Tombstone A expires without being repaired
>>>> * Day 12: Repair 2 repairs Token A and completes
>>>>
>>>
>>>
>>> You need to *start and complete* a repair within any gc_grace_seconds
>>> window.
>>> In your example no repair started and completed in the Day 1-11 window.
>>>
>>> We do need to word this better, thanks for pointing it out Mike.
>>>
>>
