[ 
https://issues.apache.org/jira/browse/CASSANDRA-8621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15191015#comment-15191015
 ] 

Paulo Motta commented on CASSANDRA-8621:
----------------------------------------

The stalled-stream issue that originated this ticket was likely caused by 
CASSANDRA-11286. With that fix in place and a properly configured network 
(i.e. a smaller keepalive interval), connections won't die unless there is a 
network partition, so I think this feature loses relevance: it would add more 
state/complexity to the streaming protocol without clear benefits. I propose 
we close this as later and re-evaluate if there are still broken connections 
after CASSANDRA-11286. WDYT [~yukim]?

> For streaming operations, when a socket is closed/reset, we should 
> retry/reinitiate that stream
> -----------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-8621
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8621
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Streaming and Messaging
>            Reporter: Jeremy Hanna
>            Assignee: Paulo Motta
>
> Currently we have a setting (streaming_socket_timeout_in_ms) that will 
> time out and retry the stream operation when TCP is idle for a period of 
> time.  However, when the socket is closed or reset, we do not retry the 
> operation.  This can happen for a number of reasons, including a firewall 
> sending a reset on a socket during a streaming operation such as nodetool 
> rebuild (which necessarily runs across DCs) or repair.
> Doing a retry would make the streaming operations more resilient.  It would 
> be good to log the retry clearly as well (with the stream session ID and node 
> address).
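
For illustration only, the retry behaviour requested above could look roughly 
like the sketch below. It is written in Python rather than Cassandra's Java, 
and every name in it (stream_with_retry, make_flaky_send, the log format) is 
hypothetical; it is not how Cassandra's streaming code is structured. It only 
shows the idea of retrying when a socket is closed or reset and logging each 
retry with the stream session ID and node address.

```python
import logging
import uuid

log = logging.getLogger("streaming")

# Errors raised when the peer (or a firewall in between) closes or resets
# the socket. Any other exception is treated as non-retryable.
RETRYABLE = (ConnectionResetError, ConnectionAbortedError, BrokenPipeError)


def stream_with_retry(send, session_id, peer, max_retries=3):
    """Call send() until it succeeds, retrying only on closed/reset sockets.

    Returns the number of retries that were needed. Re-raises the last
    error once max_retries retries have been used up.
    """
    for attempt in range(max_retries + 1):
        try:
            send()
            return attempt
        except RETRYABLE as exc:
            if attempt == max_retries:
                raise
            # Log the retry with the session ID and the node address.
            log.warning(
                "[Stream #%s] connection to %s failed (%s); retry %d of %d",
                session_id, peer, type(exc).__name__, attempt + 1, max_retries,
            )


def make_flaky_send(failures):
    """Hypothetical stand-in for a stream task.

    The returned callable raises a connection reset `failures` times and
    succeeds on every call after that.
    """
    remaining = [failures]

    def send():
        if remaining[0] > 0:
            remaining[0] -= 1
            raise ConnectionResetError("connection reset by peer")

    return send


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    retries = stream_with_retry(make_flaky_send(2), uuid.uuid4(), "10.0.0.2")
    print("succeeded after", retries, "retries")
```

A real implementation would also need to decide how much of the stream to 
resend after a reset, which is the extra protocol state discussed in the 
comment above.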



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
