Paulo Motta created CASSANDRA-11839:
---------------------------------------

             Summary: Active streams fail with SocketTimeoutException
                 Key: CASSANDRA-11839
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11839
             Project: Cassandra
          Issue Type: Bug
            Reporter: Paulo Motta
            Assignee: Paulo Motta


The original reasoning behind {{streaming_socket_timeout_in_ms}} was to kill 
one-sided hanging streams (CASSANDRA-3838). The timeout itself was never much 
of a problem while the default was zero (never time out).

In CASSANDRA-8611 we changed the default to 1 hour, but it was never enforced 
due to CASSANDRA-11286, which was fixed recently.

In recent releases we've been receiving reports of stream failures when 
streaming large files: the sender's incoming socket becomes inactive, times 
out after 1 hour, and the stream session fails with 
{{SocketTimeoutException}} (CASSANDRA-11345, CASSANDRA-11826), even though the 
stream session is still active. The session also fails if a 2i/MV rebuild 
takes longer than 1 hour on the receiver (CASSANDRA-8343).
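
To illustrate the failure mode, here is a minimal sketch in plain Java, not 
Cassandra code. It assumes the timeout behaves like a standard socket read 
timeout ({{Socket.setSoTimeout}}); the class and method names are hypothetical, 
and the 200 ms value stands in for the 1 hour default. The peer stays connected 
but silent, and the read still fails.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class, not part of Cassandra.
public class IdleSocketTimeoutDemo {

    // Returns true if a read on a healthy but silent connection fails with
    // SocketTimeoutException once timeoutMs milliseconds pass without data.
    static boolean timesOutWhileIdle(int timeoutMs) throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket peer = server.accept()) {
            // Assumption: plays the role of streaming_socket_timeout_in_ms.
            client.setSoTimeout(timeoutMs);
            InputStream in = client.getInputStream();
            try {
                in.read();      // the peer is connected but never writes
                return false;   // not reached in this demo
            } catch (SocketTimeoutException e) {
                return true;    // the connection is intact, only idle
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("timed out while idle: " + timesOutWhileIdle(200));
    }
}
```

The read timeout cannot tell an idle but healthy peer from a hung one, which 
is why a long transfer or rebuild on the other side trips it.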

The definitive fix on trunk is to add a {{KeepAlive}} message to the stream 
protocol to detect broken connections and retire 
{{streaming_socket_timeout_in_ms}}. In older versions, we must also raise the 
default {{streaming_socket_timeout_in_ms}} to a more conservative value, so 
that it can still detect long-hanging streams.
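
The keep-alive idea can be sketched in plain Java as follows. This is only an 
illustration under assumptions, not the stream protocol change itself: the 
class name, the single-byte heartbeat, and the intervals are all hypothetical. 
The peer writes a heartbeat more often than the reader's timeout, so the 
reader stays alive for longer than the timeout without failing.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo class, not part of Cassandra.
public class KeepAliveDemo {

    // The peer sends `beats` heartbeat bytes, one every keepAliveMs.
    // Returns how many the reader received before any read timeout.
    static int readWithKeepAlive(int timeoutMs, int keepAliveMs, int beats) throws Exception {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket peer = server.accept()) {
            client.setSoTimeout(timeoutMs); // read timeout on the waiting side

            Thread sender = new Thread(() -> {
                try {
                    OutputStream out = peer.getOutputStream();
                    for (int i = 0; i < beats; i++) {
                        Thread.sleep(keepAliveMs);
                        out.write(0);   // hypothetical one-byte keep-alive
                        out.flush();
                    }
                } catch (IOException | InterruptedException ignored) {
                    // the demo simply stops sending on error
                }
            });
            sender.start();

            InputStream in = client.getInputStream();
            int received = 0;
            for (int i = 0; i < beats; i++) {
                if (in.read() == -1) {
                    break;              // peer closed the connection
                }
                received++;             // each heartbeat restarts the timeout window
            }
            sender.join();
            return received;
        }
    }

    public static void main(String[] args) throws Exception {
        // 8 beats at 100 ms span about 800 ms, longer than the 500 ms timeout.
        System.out.println("heartbeats received: " + readWithKeepAlive(500, 100, 8));
    }
}
```

With heartbeats in place, a read timeout signals a peer that has really gone 
quiet rather than one that is merely busy.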


