[jira] [Commented] (CASSANDRA-7560) 'nodetool repair -pr' leads to indefinitely hanging AntiEntropySession

Jeremiah Jordan (JIRA) Mon, 18 Aug 2014 08:12:13 -0700

    [ https://issues.apache.org/jira/browse/CASSANDRA-7560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14100731#comment-14100731 ]


Jeremiah Jordan commented on CASSANDRA-7560:
--------------------------------------------

[~yukim] Ah, so if you run with -par you can avoid that problem?

> 'nodetool repair -pr' leads to indefinitely hanging AntiEntropySession
> ----------------------------------------------------------------------
>
>                 Key: CASSANDRA-7560
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7560
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Vladimir Avram
>            Assignee: Yuki Morishita
>             Fix For: 2.0.10
>
>         Attachments: 0001-backport-CASSANDRA-6747.patch, 
> cassandra_daemon.log, cassandra_daemon_rep1.log, cassandra_daemon_rep2.log, 
> nodetool_command.log
>
>
> Running {{nodetool repair -pr}} will sometimes hang on one of the resulting 
> AntiEntropySessions.
> The system logs will show the repair command starting:
> {noformat}
>  INFO [Thread-3079] 2014-07-15 02:22:56,514 StorageService.java (line 2569) 
> Starting repair command #1, repairing 256 ranges for keyspace x
> {noformat}
> You can then see a few AntiEntropySessions completing with:
> {noformat}
> INFO [AntiEntropySessions:2] 2014-07-15 02:28:12,766 RepairSession.java (line 
> 282) [repair #eefb3c30-0bc6-11e4-83f7-a378978d0c49] session completed 
> successfully
> {noformat}
> Eventually we reach an AntiEntropySession that hangs just before requesting 
> the Merkle trees for the next column family in line for repair. We first see 
> the previous CF being finished, and then the whole repair session hangs here 
> with no visible progress or errors on this or any of the related nodes.
> {noformat}
> INFO [AntiEntropyStage:1] 2014-07-15 02:38:20,325 RepairSession.java (line 
> 221) [repair #8f85c1b0-0bc8-11e4-83f7-a378978d0c49] previous_cf is fully 
> synced
> {noformat}
> Notes:
> * Single-DC, 6-node cluster with an average load of 86 GB per node.
> * This appears to be random; it does not always happen on the same CF or on 
> the same session.
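
One way to spot the stuck session from the symptoms described above is to scan system.log for repair IDs that never logged "session completed successfully". The following is only a rough sketch: it assumes the log lines look like the excerpts quoted in this report, and the function name unfinished_repair_sessions is made up for illustration. A session that is still legitimately running will also show up, so the result is a list of candidates, not proof of a hang.

```python
import re

# Matches the "[repair #<uuid>]" tag seen in the RepairSession log lines
# quoted in the report (assumed format, not an official log specification).
REPAIR_TAG = re.compile(r"\[repair #([0-9a-f-]{36})\]")


def unfinished_repair_sessions(log_lines):
    """Map each repair session ID without a completion line to its last log line."""
    last_line = {}     # session id -> most recent log line mentioning it
    completed = set()  # session ids that logged a successful completion
    for line in log_lines:
        match = REPAIR_TAG.search(line)
        if not match:
            continue
        session_id = match.group(1)
        last_line[session_id] = line.strip()
        if "session completed successfully" in line:
            completed.add(session_id)
    return {sid: text for sid, text in last_line.items() if sid not in completed}


if __name__ == "__main__":
    # Sample lines copied from the report, unwrapped onto single lines.
    sample = [
        "INFO [AntiEntropySessions:2] 2014-07-15 02:28:12,766 RepairSession.java (line 282) "
        "[repair #eefb3c30-0bc6-11e4-83f7-a378978d0c49] session completed successfully",
        "INFO [AntiEntropyStage:1] 2014-07-15 02:38:20,325 RepairSession.java (line 221) "
        "[repair #8f85c1b0-0bc8-11e4-83f7-a378978d0c49] previous_cf is fully synced",
    ]
    for sid, text in unfinished_repair_sessions(sample).items():
        print(sid, "->", text)
```

The last line recorded for each candidate shows which column family finished syncing just before progress stopped, which is the same signal the report relies on.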



--
This message was sent by Atlassian JIRA
(v6.2#6252)
