[jira] [Updated] (CASSANDRA-17168) Don't block gossip when clearing snapshots for failing repairs

Marcus Eriksson (Jira) Wed, 24 Nov 2021 00:21:20 -0800


     [ https://issues.apache.org/jira/browse/CASSANDRA-17168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Marcus Eriksson updated CASSANDRA-17168:
----------------------------------------
     Bug Category: Parent values: Availability(12983) Level 1 values: Unavailable(12994)
       Complexity: Normal
      Component/s: Consistency/Repair
    Discovered By: Adhoc Test
    Fix Version/s: 4.0.x
                   4.x
        Reviewers: David Capwell
         Severity: Normal
           Status: Open  (was: Triage Needed)

trunk:
https://github.com/apache/cassandra/pull/1340
https://app.circleci.com/pipelines/github/krummas/cassandra?branch=marcuse%2F17168-trunk
4.0:
https://github.com/apache/cassandra/pull/1341 
https://app.circleci.com/pipelines/github/krummas/cassandra?branch=marcuse%2F17168

Note that the trunk version includes a change to the PREPARE message so that it
carries the repair parallelism, instead of setting a flag on ParentRepairSession.

> Don't block gossip when clearing snapshots for failing repairs
> --------------------------------------------------------------
>
>                 Key: CASSANDRA-17168
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17168
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Repair
>            Reporter: Marcus Eriksson
>            Assignee: Marcus Eriksson
>            Priority: Normal
>             Fix For: 4.0.x, 4.x
>
>
> We clear snapshots in the GossipTasks thread when a repair session fails due
> to a replica shutting down. If there are many tables/repair sessions ongoing,
> this can take a long time. With enough tables being repaired at the same time,
> even checking whether the snapshots exist can take long enough for nodes to be
> marked down. We should clear snapshots in a separate thread and add a flag to
> tell us whether this repair session can have snapshots, to avoid checking if
> the directory exists.
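
A minimal sketch of the approach described above, not the code from the linked
pull requests: cleanup is handed to a dedicated executor so the calling thread
(GossipTasks in the issue) returns immediately, and a per-session flag lets
sessions that cannot have snapshots skip the filesystem check entirely. All
names here (SnapshotCleanupSketch, RepairSessionInfo, mayHaveSnapshots,
onSessionFailed) are hypothetical and are not Cassandra APIs; the actual
snapshot deletion is replaced by a simple record of what would be cleared.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class SnapshotCleanupSketch {
    /** Hypothetical stand-in for a repair session; not Cassandra's ParentRepairSession. */
    static final class RepairSessionInfo {
        final String sessionId;
        final boolean mayHaveSnapshots; // assumed flag: set when the session could have taken snapshots

        RepairSessionInfo(String sessionId, boolean mayHaveSnapshots) {
            this.sessionId = sessionId;
            this.mayHaveSnapshots = mayHaveSnapshots;
        }
    }

    // Dedicated single-thread executor so slow disk work never runs on the caller's thread.
    private final ExecutorService cleanupExecutor = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "snapshot-cleanup");
        t.setDaemon(true);
        return t;
    });

    // Records "sessionId@threadName" for each simulated cleanup, for illustration only.
    final List<String> cleared = new CopyOnWriteArrayList<>();

    /** Called when a session fails; returns at once and never touches the disk itself. */
    CompletableFuture<Void> onSessionFailed(RepairSessionInfo session) {
        if (!session.mayHaveSnapshots) {
            // Flag says no snapshots are possible: skip even the existence check.
            return CompletableFuture.completedFuture(null);
        }
        return CompletableFuture.runAsync(() -> clearSnapshots(session), cleanupExecutor);
    }

    // Placeholder for the real directory check and deletion.
    private void clearSnapshots(RepairSessionInfo session) {
        cleared.add(session.sessionId + "@" + Thread.currentThread().getName());
    }

    void shutdown() {
        cleanupExecutor.shutdown();
    }
}
```

In this sketch the caller can ignore the returned future; it is exposed only so
that callers which care (tests, for example) can wait for the cleanup to finish.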



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
