[
https://issues.apache.org/jira/browse/CASSANDRA-17955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Stefan Miklosovic updated CASSANDRA-17955:
------------------------------------------
Test and Documentation Plan: ci
Status: Patch Available (was: In Progress)
This might be the solution (1). Basically, we need to make sure that a new
snapshot is not taken until the existing snapshots have been cleared. The
executor in ActiveRepairService runs snapshot cleanup in a non-blocking way,
and it can run only one thread at any given time.
CassandraTableRepairManager takes an ephemeral snapshot, and that can race
with the cleanup in ActiveRepairService: while a snapshot is being cleared,
the cleanup expects the directory to be empty, but it is not, because
CassandraTableRepairManager has just created a snapshot in it.
(1) https://github.com/apache/cassandra/pull/1903/files
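To illustrate the ordering described above, here is a minimal sketch in which both the cleanup and the snapshot run on the same single-threaded executor, so a snapshot queued after a cleanup cannot start until that cleanup has finished. This is only an illustration and is not necessarily what the linked patch does; the class SnapshotOrderingSketch and its methods are hypothetical names, and a "snapshot" is modelled as an empty file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch, not Cassandra code: all names here are illustrative.
class SnapshotOrderingSketch
{
    // Stand-in for the single-threaded executor in ActiveRepairService.
    private final ExecutorService snapshotExecutor = Executors.newSingleThreadExecutor();

    // Queues a cleanup of the snapshot directory without blocking the caller.
    Future<?> clearSnapshotAsync(Path dir)
    {
        return snapshotExecutor.submit(() -> {
            deleteRecursive(dir);
            return null;
        });
    }

    // Takes a "snapshot" (an empty file) on the same executor and waits for it,
    // so it can never interleave with a cleanup task.
    Path takeSnapshot(Path dir, String name) throws Exception
    {
        return snapshotExecutor.submit(() -> {
            Files.createDirectories(dir);
            return Files.createFile(dir.resolve(name));
        }).get();
    }

    void shutdown()
    {
        snapshotExecutor.shutdown();
    }

    // Deletes children before their parents by walking the tree in reverse order.
    private static void deleteRecursive(Path dir) throws IOException
    {
        if (!Files.exists(dir))
            return;
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(dir))
        {
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths)
            Files.delete(p);
    }

    public static void main(String[] args) throws Exception
    {
        SnapshotOrderingSketch sketch = new SnapshotOrderingSketch();
        Path dir = Files.createTempDirectory("snapshots");
        sketch.takeSnapshot(dir, "old");
        Future<?> cleanup = sketch.clearSnapshotAsync(dir); // returns immediately
        Path fresh = sketch.takeSnapshot(dir, "new");       // queued behind the cleanup
        cleanup.get();
        System.out.println("new snapshot survives cleanup: " + Files.exists(fresh));
        sketch.shutdown();
    }
}
```

Because the executor has a single thread and runs tasks in submission order, the cleanup either completes before the snapshot is created or has not started yet; it never observes a half-populated directory.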
> Race condition on repair snapshots
> ----------------------------------
>
> Key: CASSANDRA-17955
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17955
> Project: Cassandra
> Issue Type: Bug
> Components: Consistency/Repair, Local/Snapshots
> Reporter: Cameron Zemek
> Assignee: Stefan Miklosovic
> Priority: Normal
> Labels: 4.0
> Fix For: 4.0.x, 4.1-rc, 4.x
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> If an endpoint is convicted and that endpoint is a coordinator, then
> ActiveRepairService::removeParentRepairSession is called.
> The issue is that the resulting cleanup runs on clearSnapshotExecutor and can
> happen while RepairMessageVerbHandler is in the process of taking a snapshot.
> The result is a race condition in which clearSnapshot throws a
> java.nio.file.DirectoryNotEmptyException.
>
> {code:java}
> public static void deleteRecursiveWithThrottle(File dir, RateLimiter rateLimiter)
> {
>     if (dir.isDirectory())
>     {
>         String[] children = dir.list();
>         for (String child : children)
>             deleteRecursiveWithThrottle(new File(dir, child), rateLimiter);
>     }
>     // The directory is now empty so now it can be smoked
>     deleteWithConfirmWithThrottle(dir, rateLimiter);
> }
> {code}
> The exception is thrown because the directory is no longer empty when the
> method goes to remove it at the end.
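
The failure mode in the description can be reproduced deterministically with a small sketch. It assumes the default file system provider, where java.nio.file.Files.delete reports a non-empty directory as DirectoryNotEmptyException. The class DeleteRaceSketch and the hook parameter are hypothetical; the hook stands in for a snapshot being written concurrently between the child deletion and the final directory removal.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryNotEmptyException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch, not Cassandra code: it mimics the shape of the
// recursive delete quoted in the issue, using java.nio.file instead.
class DeleteRaceSketch
{
    // Deletes the children of dir, runs the hook, then deletes dir itself.
    // The hook simulates another thread writing into dir at the worst moment.
    static void deleteRecursive(Path dir, Runnable beforeFinalDelete) throws IOException
    {
        if (Files.isDirectory(dir))
        {
            List<Path> children;
            try (Stream<Path> listing = Files.list(dir))
            {
                children = listing.collect(Collectors.toList());
            }
            for (Path child : children)
                deleteRecursive(child, () -> {});
        }
        beforeFinalDelete.run();
        Files.delete(dir); // fails if something was added after the listing
    }

    // Returns true if the simulated interleaving produces DirectoryNotEmptyException.
    static boolean reproduces() throws IOException
    {
        Path dir = Files.createTempDirectory("snapshots");
        Files.createFile(dir.resolve("old-snapshot"));
        Path late = dir.resolve("new-snapshot");
        try
        {
            deleteRecursive(dir, () -> {
                try
                {
                    Files.createFile(late); // the "concurrent" snapshot
                }
                catch (IOException e)
                {
                    throw new UncheckedIOException(e);
                }
            });
            return false;
        }
        catch (DirectoryNotEmptyException e)
        {
            // Tidy up the temporary files left behind by the failed delete.
            Files.deleteIfExists(late);
            Files.deleteIfExists(dir);
            return true;
        }
    }

    public static void main(String[] args) throws IOException
    {
        System.out.println("race reproduced: " + reproduces());
    }
}
```

In the real race the extra file appears from another thread rather than from a hook, so the failure is intermittent instead of guaranteed.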