[
https://issues.apache.org/jira/browse/CASSANDRA-10992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094989#comment-15094989
]
Paulo Motta commented on CASSANDRA-10992:
-----------------------------------------
I don't know exactly what's happening, but the {{AsynchronousCloseException}}
makes it smell like the interrupt workaround for CASSANDRA-10012 is closing the
channel after a genuine timeout, which prevents a retry. This was fixed in
CASSANDRA-10961, so to test that hypothesis, could you try deploying the jar I
attached (which contains the 2.1 revert for CASSANDRA-10012) in place of the
existing one on a subset of the nodes involved in the repair? A rolling restart
will be needed. If this does not solve the issue, please attach the
corresponding trace logs as instructed before, making sure to enable trace
logging in the logback configuration before triggering the faulty repair
operation with the new jar in place.
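
To check whether sessions are still stuck after the jar swap, two snapshots of the {{netstats}} summary lines (as in the issue description) can be compared. The following is only a rough sketch, not part of Cassandra or Reaper: the function names are hypothetical, and because the {{grep total}} output carries no session or peer identifiers, entries are matched purely by their counters, which is an approximation.

```python
import re

# Matches one summary entry from `nodetool netstats -H | grep total`, e.g.
# "Receiving 5 files, 46.59 MB total. Already received 1 files, 11.32 MB total"
SUMMARY = re.compile(
    r"(Receiving|Sending)\s+(\d+)\s+files,\s+([\d.]+\s+\w+)\s+total\.\s+"
    r"Already\s+(?:received|sent)\s+(\d+)\s+files,\s+([\d.]+\s+\w+)\s+total"
)


def parse_summaries(text):
    """Return (direction, total_files, total_size, done_files, done_size) tuples."""
    # Collapse whitespace and drop quote markers so wrapped lines still match.
    flat = " ".join(text.replace(">", " ").split())
    return [
        (m.group(1), int(m.group(2)), m.group(3), int(m.group(4)), m.group(5))
        for m in SUMMARY.finditer(flat)
    ]


def stalled_sessions(before, after):
    """Entries that are incomplete and unchanged between the two snapshots."""
    earlier = parse_summaries(before)
    return [s for s in parse_summaries(after) if s[1] != s[3] and s in earlier]
```

Feeding it two captures taken some time apart returns the transfers whose file counts and sizes have not moved; an empty list would suggest the streams either completed or were cleaned up.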
> Hanging streaming sessions
> --------------------------
>
> Key: CASSANDRA-10992
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10992
> Project: Cassandra
> Issue Type: Bug
> Environment: C* 2.1.12, Debian Wheezy
> Reporter: mlowicki
> Assignee: Paulo Motta
> Fix For: 2.1.12
>
> Attachments: apache-cassandra-2.1.12-SNAPSHOT.jar
>
>
> I've recently started running repairs using [Cassandra
> Reaper|https://github.com/spotify/cassandra-reaper] (the built-in
> {{nodetool repair}} doesn't work for me, see CASSANDRA-9935). It behaves
> fine, but I've noticed hanging streaming sessions:
> {code}
> root@db1:~# date
> Sat Jan 9 16:43:00 UTC 2016
> root@db1:~# nt netstats -H | grep total
> Receiving 5 files, 46.59 MB total. Already received 1 files, 11.32 MB total
> Sending 7 files, 46.28 MB total. Already sent 7 files, 46.28 MB total
> Receiving 6 files, 64.15 MB total. Already received 1 files, 12.14 MB total
> Sending 5 files, 61.15 MB total. Already sent 5 files, 61.15 MB total
> Receiving 4 files, 7.75 MB total. Already received 3 files, 7.58 MB total
> Sending 4 files, 4.29 MB total. Already sent 4 files, 4.29 MB total
> Receiving 12 files, 13.79 MB total. Already received 11 files, 7.66 MB total
> Sending 5 files, 15.32 MB total. Already sent 5 files, 15.32 MB total
> Receiving 8 files, 20.35 MB total. Already received 1 files, 13.63 MB total
> Sending 38 files, 125.34 MB total. Already sent 38 files, 125.34 MB total
> root@db1:~# date
> Sat Jan 9 17:45:42 UTC 2016
> root@db1:~# nt netstats -H | grep total
> Receiving 5 files, 46.59 MB total. Already received 1 files, 11.32 MB total
> Sending 7 files, 46.28 MB total. Already sent 7 files, 46.28 MB total
> Receiving 6 files, 64.15 MB total. Already received 1 files, 12.14 MB total
> Sending 5 files, 61.15 MB total. Already sent 5 files, 61.15 MB total
> Receiving 4 files, 7.75 MB total. Already received 3 files, 7.58 MB total
> Sending 4 files, 4.29 MB total. Already sent 4 files, 4.29 MB total
> Receiving 12 files, 13.79 MB total. Already received 11 files, 7.66 MB total
> Sending 5 files, 15.32 MB total. Already sent 5 files, 15.32 MB total
> Receiving 8 files, 20.35 MB total. Already received 1 files, 13.63 MB total
> Sending 38 files, 125.34 MB total. Already sent 38 files, 125.34 MB total
> {code}
> Such sessions remain long after the repair job has finished (confirmed by
> checking Reaper's and Cassandra's logs). {{streaming_socket_timeout_in_ms}}
> in cassandra.yaml is set to the default value (3600000).
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)