[
https://issues.apache.org/jira/browse/CASSANDRA-11841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15300957#comment-15300957
]
Paulo Motta commented on CASSANDRA-11841:
-----------------------------------------
The basic idea is to replace {{streaming_socket_timeout_in_ms}} with a new
property, {{streaming_keep_alive_period_in_ms}}, with a default period of 5 minutes.
The incoming socket timeout is set to {{2 * streaming_keep_alive_period_in_ms}},
so if either peer receives no data for 2 keep-alive rounds (10 minutes with
default settings), the stream session fails with
{{SocketTimeoutException}}.
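To make the arithmetic concrete, here is a minimal sketch of how the read timeout could be derived from the keep-alive period. The class and method names are hypothetical and are not taken from the patch; only the 5-minute default and the factor of 2 come from the description above.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper for illustration only; not the code in the patch. */
final class StreamTimeouts {
    /** Default keep-alive period described above: 5 minutes. */
    static final long DEFAULT_KEEP_ALIVE_PERIOD_MS = TimeUnit.MINUTES.toMillis(5);

    /**
     * Read timeout for the incoming socket: two keep-alive rounds.
     * Returned as int because java.net.Socket#setSoTimeout takes an int.
     */
    static int incomingSocketTimeoutMs(long keepAlivePeriodMs) {
        if (keepAlivePeriodMs <= 0) {
            throw new IllegalArgumentException("keep-alive period must be positive");
        }
        // multiplyExact / toIntExact fail loudly on overflow instead of wrapping
        return Math.toIntExact(Math.multiplyExact(2L, keepAlivePeriodMs));
    }
}
```

The result would typically be passed to {{Socket.setSoTimeout(int)}}, after which a blocking read that sees no data for that long throws {{SocketTimeoutException}}.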
For the duration of the stream session, each stream peer keeps a scheduled task
with a period of {{streaming_keep_alive_period_in_ms}} that sends a new
{{KeepAlive}} message to the other peer. The task skips sending a new
keep-alive message if the previous one has not been sent yet, so keep-alive
messages do not accumulate while the node is actively streaming a large file.
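The skip-if-still-pending behaviour can be sketched roughly as follows. This is an illustration under assumptions, not the code in the patch: {{KeepAliveScheduler}}, {{tick()}} and the {{Consumer<Runnable>}} sender (which stands in for "enqueue a {{KeepAlive}} and call me back once it has been written") are all hypothetical names.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

/** Hypothetical sketch of a keep-alive task that never queues more than one message. */
final class KeepAliveScheduler {
    // true while a keep-alive has been enqueued but not yet written to the socket
    private final AtomicBoolean pending = new AtomicBoolean(false);
    // Assumed hook: enqueues one keep-alive and runs the callback after it is sent
    private final Consumer<Runnable> sender;

    KeepAliveScheduler(Consumer<Runnable> sender) {
        this.sender = sender;
    }

    /** One round of the periodic task; returns true if a keep-alive was enqueued. */
    boolean tick() {
        if (!pending.compareAndSet(false, true)) {
            return false; // previous keep-alive is still waiting behind file data
        }
        sender.accept(() -> pending.set(false));
        return true;
    }

    /** Runs tick() once per period until the returned future is cancelled. */
    ScheduledFuture<?> start(ScheduledExecutorService executor, long periodMs) {
        return executor.scheduleAtFixedRate(this::tick, periodMs, periodMs, TimeUnit.MILLISECONDS);
    }
}
```

The returned future would be cancelled when the stream session ends, so the task lives only as long as the session.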
The feature is only enabled if the peer is on version >= 3.8, so the stream
protocol remains backward compatible. Otherwise it just falls back to
{{streaming_socket_timeout_in_ms}} (that's why we must keep it as a hidden
property until the next stream protocol version bump).
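As a rough sketch of the fallback rule, assuming the peer's release version is available as major and minor numbers (the patch may well detect this differently), the timeout selection could look like this. All names here are hypothetical.

```java
/** Hypothetical sketch of the version gate; not the code in the patch. */
final class KeepAliveSupport {
    /** Keep-alive is only used with peers on version >= 3.8. */
    static boolean supportsKeepAlive(int major, int minor) {
        return major > 3 || (major == 3 && minor >= 8);
    }

    /**
     * Picks the incoming socket timeout: two keep-alive rounds for new peers,
     * the legacy streaming_socket_timeout_in_ms value for older ones.
     */
    static long readTimeoutMs(int peerMajor, int peerMinor,
                              long keepAlivePeriodMs, long legacySocketTimeoutMs) {
        return supportsKeepAlive(peerMajor, peerMinor)
               ? 2 * keepAlivePeriodMs
               : legacySocketTimeoutMs;
    }
}
```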
I added [dtests|https://github.com/pauloricardomg/cassandra-dtest/tree/11841]
to check that the stream session remains active if the transfer of a single
file takes longer than {{streaming_keep_alive_period_in_ms}} for bootstrap and
replace_address. I also added a mixed-version test to check that the feature is
not enabled when streaming with a peer on a version < 3.8.
Patch and tests available below:
||trunk||dtest||
|[branch|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:11841-trunk]|[branch|https://github.com/riptano/cassandra-dtest/compare/master...pauloricardomg:11841]|
|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-11841-trunk-testall/lastCompletedBuild/testReport/]|
|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-11841-trunk-dtest/lastCompletedBuild/testReport/]|
ps: this is built on top of CASSANDRA-11840.
> Add keep-alive to stream protocol
> ---------------------------------
>
> Key: CASSANDRA-11841
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11841
> Project: Cassandra
> Issue Type: Sub-task
> Reporter: Paulo Motta
> Assignee: Paulo Motta
>