[
https://issues.apache.org/jira/browse/CASSANDRA-13797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Kurt Greaves reopened CASSANDRA-13797:
--------------------------------------
> RepairJob blocks on syncTasks
> -----------------------------
>
> Key: CASSANDRA-13797
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13797
> Project: Cassandra
> Issue Type: Bug
> Components: Repair
> Reporter: Blake Eggleston
> Assignee: Blake Eggleston
> Priority: Major
> Fix For: 3.0.15, 3.11.1, 4.0
>
>
> The thread running {{RepairJob}} blocks while it waits for the validations it
> starts to complete ([see
> here|https://github.com/bdeggleston/cassandra/blob/9fdec0a82851f5c35cd21d02e8c4da8fc685edb2/src/java/org/apache/cassandra/repair/RepairJob.java#L185]).
> However, the downstream callbacks (i.e., the post-repair cleanup) aren't
> waiting for {{RepairJob#run}} to return; they're waiting for a result to be
> set on the {{RepairJob}} future, which happens after the sync tasks have
> completed. The post-repair cleanup also immediately shuts down the
> executor {{RepairJob#run}} is running in. So in noop repair sessions, where
> there's nothing to stream, I'm seeing the callbacks sometimes fire before
> {{RepairJob#run}} wakes up, causing an {{InterruptedException}} to be thrown.
> I'm pretty sure this blocking wait can just be removed, but I'd like a second
> opinion. It appears to be a holdover from before repair coordination became
> async. I thought it might be doing some throttling by blocking, but each repair
> session gets its own executor, and validation is throttled by the fixed-size
> executors doing the actual work of validation, so I don't think we need
> to keep it around.
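
For anyone following along, the interaction described above can be reproduced in miniature. The sketch below is not Cassandra code: the class, method, and variable names are all hypothetical, and it uses plain java.util.concurrent types in place of the repair classes. A task blocks inside a single-threaded executor, a callback on a separate result future shuts that executor down, and the blocked task sees an InterruptedException. The latch that never opens stands in for the window in which {{RepairJob#run}} has not yet woken up.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in; none of these names come from the Cassandra code base.
final class BlockingJobSketch {

    /** Returns true if the blocked "run" task was interrupted by the cleanup callback. */
    static boolean runWasInterrupted() {
        // Plays the role of the per-session executor that the job runs in.
        ExecutorService jobExecutor = Executors.newSingleThreadExecutor();
        // Plays the role of the job's future, which downstream callbacks wait on.
        CompletableFuture<Void> jobResult = new CompletableFuture<>();
        CountDownLatch runStarted = new CountDownLatch(1);
        // Never counted down: models run() still being parked in its blocking wait.
        CountDownLatch notYetAwake = new CountDownLatch(1);
        AtomicBoolean interrupted = new AtomicBoolean(false);

        // Cleanup callback: fires as soon as the result is set and shuts the executor down.
        jobResult.thenRun(jobExecutor::shutdownNow);

        jobExecutor.execute(() -> {
            runStarted.countDown();
            try {
                notYetAwake.await(); // the blocking wait inside "run"
            } catch (InterruptedException e) {
                interrupted.set(true); // shutdownNow() interrupted the worker thread
            }
        });

        try {
            runStarted.await();
            // Setting the result runs the callback on this thread while "run" is still blocked.
            jobResult.complete(null);
            jobExecutor.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return interrupted.get();
    }

    public static void main(String[] args) {
        System.out.println("run() interrupted: " + runWasInterrupted());
    }
}
```

In this simplified model the interrupt always happens, because the latch is never released; in the real scenario it depends on timing. If the task registered a callback on the work it is waiting for and returned instead of blocking, there would be no parked thread for the shutdown to interrupt.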
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)