Hi Kristijonas,

It is not possible to run two repairs, regardless of whether they are incremental or full, for the same token range and on the same table concurrently. You have two options:

1. Create a schedule that doesn't overlap, e.g. run incremental repair daily except on the 1st of each month, and run full repair on the 1st of each month. If you choose to do this, make sure you set up monitoring and alerting for it and have someone respond to the alerts on weekends and public holidays. If a repair takes longer than usual and is at risk of overlapping with the next repair, timely human intervention is required to prevent that: either kill the currently running repair or skip the next one.

2. Use an orchestration tool, such as Cassandra Reaper, to take care of that for you. You will still need monitoring and alerting to ensure the repairs run successfully, but fixing a stuck or failed repair is not very time-sensitive; you can usually leave it until Monday morning if it happens on Friday night.
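
To illustrate the first option, here is a minimal sketch of the scheduling decision only. The function names, the `previous_still_running` flag and the keyspace name are hypothetical and made up for this example. It assumes repairs are started with `nodetool repair`, which is incremental by default on 4.x and takes `--full` for a full repair. The sketch only builds the command; it does not run it, and it does not detect a running repair for you.

```python
from datetime import date
from typing import List


def repair_kind(day: date) -> str:
    """Full repair on the 1st of the month, incremental on every other day."""
    return "full" if day.day == 1 else "incremental"


def repair_command(day: date, keyspace: str) -> List[str]:
    """Build the nodetool command for the given day (the command is not executed here)."""
    cmd = ["nodetool", "repair"]
    if repair_kind(day) == "full":
        cmd.append("--full")
    cmd.append(keyspace)
    return cmd


def next_action(day: date, previous_still_running: bool) -> str:
    """Skip today's repair if the previous one has not finished, to avoid an overlap.

    How `previous_still_running` is determined is left to your monitoring;
    it is an assumption of this sketch, not something Cassandra provides here.
    """
    if previous_still_running:
        return "skip"
    return repair_kind(day)


if __name__ == "__main__":
    today = date(2024, 3, 1)
    print(next_action(today, previous_still_running=False))
    print(" ".join(repair_command(today, "my_keyspace")))  # hypothetical keyspace
```

Skipping is only one of the two choices mentioned above; killing the running repair instead would need a human (or extra tooling) to decide, which is the part an orchestration tool handles for you.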

Personally, I would recommend the 2nd option, because getting back to your laptop at 10 pm on a Friday night after you have had a few beers is not fun.

Cheers,
Bowen

On 03/02/2024 01:59, Kristijonas Zalys wrote:
Hi Bowen,

Thank you for your help!

So given that we would need to run both incremental and full repair for a given cluster, is it safe to have both types of repair running for the same token ranges at the same time? Would it not create a race condition?

Thanks,
Kristijonas

On Fri, Feb 2, 2024 at 3:36 PM Bowen Song via user <user@cassandra.apache.org> wrote:

    Hi Kristijonas,

    To answer your questions:

    1. It's still necessary to run full repair on a cluster on which
    incremental repair is run periodically. The frequency of full
    repair is more of an art than a science. Generally speaking, the
    less reliable the storage media, the more frequently full repair
    should be run. The documentation on this topic is available here:
    
<https://cassandra.apache.org/doc/stable/cassandra/operating/repair.html#incremental-and-full-repairs>

    2. Running incremental repair for the first time on an existing
    cluster does cause Cassandra to re-compact all SSTables, and can
    lead to disk usage spikes. This can be avoided by following the
    steps mentioned here:
    
<https://docs.datastax.com/en/cassandra-oss/3.0/cassandra/operations/opsRepairNodesMigration.html>


    I hope that helps.

    Cheers,
    Bowen

    On 02/02/2024 20:57, Kristijonas Zalys wrote:

    Hi folks,


    I am working on switching from full to incremental repair in
    Cassandra v4.0.6 (soon to be v4.1.3) and I have a few questions.


    1. Is it necessary to run regular full repair on a cluster if I
       already run incremental repair? If yes, what frequency would
       you recommend for full repair?

    2. Has anyone experienced disk usage spikes while using
       incremental repair? I have noticed temporary disk footprint
       increases of up to 2x (from ~15 GiB to ~30 GiB) caused by
       anti-compaction while testing and am wondering how likely
       that is to happen in bigger real world use cases?


    Thank you all in advance!

    Kristijonas
