[
https://issues.apache.org/jira/browse/CASSANDRA-3316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yuki Morishita updated CASSANDRA-3316:
--------------------------------------
Attachment: 3316-v1.txt
First attempt. Added a JMX operation (forceTerminateAllRepairSessions) to
StorageService. I think it would be better to have a nodetool command for this
feature. How about nodetool cleanuprepair?
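For readers without the patch at hand, here is a rough, self-contained sketch of the idea. It is not the code in 3316-v1.txt. Only the operation name forceTerminateAllRepairSessions comes from the comment above; the class names (RepairSessionSketch, RepairControl), the helper methods, the single-thread executor, and the "example.sketch" ObjectName are all hypothetical and chosen for illustration. The sketch assumes a stuck session can be released by interrupting its worker thread.

```java
import java.lang.management.ManagementFactory;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class RepairSessionSketch {

    /** Management interface. JMX requires it to be public and named <Impl>MBean. */
    public interface RepairControlMBean {
        void forceTerminateAllRepairSessions();
    }

    /** Hypothetical stand-in for the service that owns the repair executor. */
    public static class RepairControl implements RepairControlMBean {
        // One worker thread, so a single stuck session is enough to block new ones.
        private final ExecutorService executor = Executors.newSingleThreadExecutor();
        private final Set<Future<?>> sessions = ConcurrentHashMap.newKeySet();

        /** Submits a session that waits for a remote reply that never arrives. */
        public Future<?> submitHangingSession() {
            Future<?> session = executor.submit(() -> {
                try {
                    new CountDownLatch(1).await(); // nobody ever counts this down
                } catch (InterruptedException e) {
                    // Interrupted by forceTerminateAllRepairSessions: let the task end.
                }
            });
            sessions.add(session);
            return session;
        }

        /** Submits a well-behaved session that finishes immediately once it runs. */
        public Future<String> submitSession(String name) {
            return executor.submit(() -> name + " done");
        }

        @Override
        public void forceTerminateAllRepairSessions() {
            for (Future<?> session : sessions) {
                session.cancel(true); // interrupts the worker if the task is running
            }
            sessions.clear();
        }

        public void shutdown() {
            executor.shutdownNow();
        }
    }

    /** Returns true if a queued session runs only after the forced termination. */
    public static boolean demo() throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("example.sketch:type=RepairControl");
        RepairControl control = new RepairControl();
        server.registerMBean(control, name);
        try {
            control.submitHangingSession();
            Future<String> queued = control.submitSession("repair-2");
            boolean blocked;
            try {
                queued.get(200, TimeUnit.MILLISECONDS);
                blocked = false;
            } catch (TimeoutException e) {
                blocked = true; // the zombie session is holding the only worker
            }
            // Invoke the operation through JMX, as an operator's client would.
            server.invoke(name, "forceTerminateAllRepairSessions",
                          new Object[0], new String[0]);
            String result = queued.get(5, TimeUnit.SECONDS);
            return blocked && "repair-2 done".equals(result);
        } finally {
            server.unregisterMBean(name);
            control.shutdown();
        }
    }
}
```

Calling RepairSessionSketch.demo() fills the executor with one zombie session, checks that a second session cannot make progress, invokes the operation through the MBean server, and then checks that the second session completes. A nodetool command would presumably be a thin wrapper around the same JMX invocation.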
> Add a JMX call to force cleaning repair sessions (in case they are hang up)
> ---------------------------------------------------------------------------
>
> Key: CASSANDRA-3316
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3316
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Affects Versions: 0.8.6
> Reporter: Sylvain Lebresne
> Assignee: Yuki Morishita
> Priority: Minor
> Fix For: 1.0.2
>
> Attachments: 3316-v1.txt
>
>
> A repair session contains many parts, most of which are not local to the node
> (implying the node waits on those operations). You request Merkle trees, then
> you schedule streaming (and in 1.0.0, some of the streams don't involve the
> local node itself). There are lots of places where something can go wrong,
> and if it does, the repair is left hanging and, as a consequence, a
> repairSession task is left sitting active on the 'AntiEntropy Session'
> executor.
> Obviously, we should improve how repair detects the things that can go
> wrong. CASSANDRA-2433 started this, and CASSANDRA-3112 is open to cover as
> much of the remaining parts as possible, but my bet is that it will be hard
> to cover everything (and it may not be worth handling very improbable failure
> scenarios). Besides, CASSANDRA-3112 will involve a change in the wire
> protocol, so it may take some time to be committed. In the meantime, it would
> be nice to provide a JMX call to force terminating repairSessions so that you
> don't end up in the case where you have enough 'zombie' sessions on the
> executor that you can't submit new ones (you could restart the node, but it's
> ugly). Anyway, it's not a big issue, but it would be simple to add such a JMX
> call.