Re: Switching to Incremental Repair

Bowen Song via user Sat, 03 Feb 2024 07:23:57 -0800

Hi Kristijonas,

It is not possible to run two repairs, regardless of whether they are incremental or full, for the same token range and on the same table concurrently. You have two options:

1. Create a schedule that doesn't overlap, e.g. run incremental repair daily except the 1st of each month, and run full repair on the 1st of each month. If you choose to do this, make sure you set up a monitoring and alerting system for it and have someone respond to the alerts on weekends and public holidays. If a repair takes longer than usual and is at risk of overlapping with the next repair, timely human intervention is required to prevent that: either kill the currently running repair or skip the next one.

2. Use an orchestration tool, such as Cassandra Reaper, to take care of that for you. You will still need monitoring and alerting to ensure the repairs run successfully, but fixing a stuck or failed repair is not very time sensitive; you can usually leave it until Monday morning if it happens on Friday night.
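
For illustration only, option 1 could be sketched roughly as below. This is a minimal sketch under assumptions, not a tested scheduler: the function names `repair_command` and `needs_intervention`, the keyspace name `my_keyspace` and the one-hour safety margin are all made up for the example. The only real pieces are `nodetool repair`, which runs an incremental repair by default in Cassandra 4.x, and its `--full` flag.

```python
from datetime import date, datetime, timedelta
from typing import List


def repair_command(day: date, keyspace: str = "my_keyspace") -> List[str]:
    """Build the nodetool invocation for a given calendar day.

    Full repair on the 1st of each month, incremental repair on every
    other day, so the two types are never scheduled for the same day.
    The keyspace name is a placeholder.
    """
    cmd = ["nodetool", "repair"]
    if day.day == 1:
        cmd.append("--full")  # without this flag the repair is incremental
    cmd.append(keyspace)
    return cmd


def needs_intervention(now: datetime, next_start: datetime,
                       margin: timedelta = timedelta(hours=1)) -> bool:
    """Return True if a repair that is still running at `now` is within
    `margin` of the next scheduled repair, i.e. someone should either
    kill the running repair or skip the next one.

    The one-hour default margin is an arbitrary assumption.
    """
    return now + margin >= next_start


if __name__ == "__main__":
    print(" ".join(repair_command(date(2024, 2, 1))))
    print(" ".join(repair_command(date(2024, 2, 3))))
```

A cron job or similar could call `repair_command` once a day, and the alerting side could poll `needs_intervention` while a repair is still in progress. Note that this only decides what to run and when to raise an alert; it does nothing to stop two repairs from overlapping by itself.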

Personally I would recommend the 2nd option, because getting back to your laptop at 10 pm on Friday night after you have had a few beers is not fun.


Cheers,
Bowen

On 03/02/2024 01:59, Kristijonas Zalys wrote:

Hi Bowen,

Thank you for your help!

So given that we would need to run both incremental and full repair for a given cluster, is it safe to have both types of repair running for the same token ranges at the same time? Would it not create a race condition?


Thanks,
Kristijonas

On Fri, Feb 2, 2024 at 3:36 PM Bowen Song via user <user@cassandra.apache.org> wrote:


    Hi Kristijonas,

    To answer your questions:

    1. It's still necessary to run full repair on a cluster on which
    incremental repair is run periodically. The frequency of full
    repair is more of an art than science. Generally speaking, the
    less reliable the storage media, the more frequently full repair
    should be run. The documentation on this topic is available here
    
<https://cassandra.apache.org/doc/stable/cassandra/operating/repair.html#incremental-and-full-repairs>

    2. Running incremental repair for the first time on an existing
    cluster does cause Cassandra to re-compact all SSTables, and can
    lead to disk usage spikes. This can be avoided by following the
    steps mentioned here
    
<https://docs.datastax.com/en/cassandra-oss/3.0/cassandra/operations/opsRepairNodesMigration.html>


    I hope that helps.

    Cheers,
    Bowen

    On 02/02/2024 20:57, Kristijonas Zalys wrote:


    Hi folks,


    I am working on switching from full to incremental repair in
    Cassandra v4.0.6 (soon to be v4.1.3) and I have a few questions.


    1. Is it necessary to run regular full repair on a cluster if I
       already run incremental repair? If yes, what frequency would
       you recommend for full repair?

    2. Has anyone experienced disk usage spikes while using
       incremental repair? I have noticed temporary disk footprint
       increases of up to 2x (from ~15 GiB to ~30 GiB) caused by
       anti-compaction while testing and am wondering how likely
       that is to happen in bigger real world use cases?


    Thank you all in advance!

    Kristijonas
