[
https://issues.apache.org/jira/browse/CASSANDRA-8343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15144718#comment-15144718
]
Paulo Motta commented on CASSANDRA-8343:
----------------------------------------
Surprisingly enough I didn't manage to reproduce this issue in 2.1 because the
{{streaming_socket_timeout}} parameter was not being enforced due to the use of
a {{ReadableByteChannel}} created via {{socket.getChannel()}}, which never
times out on reads (see [this
article|https://technfun.wordpress.com/2009/01/29/networking-in-java-non-blocking-nio-blocking-nio-and-io/]
for background). The workaround is to create the {{ReadableByteChannel}} via
{{Channels.newChannel(socket.getInputStream())}} instead, so the socket
{{SO_TIMEOUT}} is respected.
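The channel workaround described above can be sketched as follows. This is a minimal, self-contained illustration, not the actual patch: the class and method names are made up for the example, and it uses a loopback connection on which the peer never sends data. The read goes through a channel wrapped around the socket's input stream, so it is expected to fail with a {{SocketTimeoutException}} once {{SO_TIMEOUT}} elapses.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.SocketChannel;

// Hypothetical demo class; none of these names come from the Cassandra code base.
class SoTimeoutChannelDemo {

    /** Returns true if a read on a silent connection times out after timeoutMillis. */
    static boolean readTimesOut(int timeoutMillis) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             SocketChannel client = SocketChannel.open(
                 new InetSocketAddress(loopback, server.getLocalPort()));
             Socket silentPeer = server.accept()) {

            Socket socket = client.socket();
            socket.setSoTimeout(timeoutMillis);

            // socket.getChannel() is non-null here, but per the comment above,
            // reads through that channel would not honor SO_TIMEOUT.
            // Wrapping the input stream routes the read through the socket's
            // stream API instead.
            ReadableByteChannel in = Channels.newChannel(socket.getInputStream());
            try {
                in.read(ByteBuffer.allocate(16)); // the peer never writes
                return false;
            } catch (SocketTimeoutException e) {
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("timed out: " + readTimesOut(200));
    }
}
```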
Even after this fix, the socket {{SO_TIMEOUT}} was never being set on the
receiving side, so I also set it while attaching the socket on the receiving side.
After the previous fixes, I managed to reproduce this issue on a [bootstrap
dtest|https://github.com/pauloricardomg/cassandra-dtest/commit/301e332758b3873d2bb61259343375107caf437b]
by introducing a sleep delay (via a system property) on the
{{OnCompletionRunnable}} larger than {{streaming_socket_timeout}}.
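A delay of this kind could be injected roughly as shown below. This is only a sketch of the technique: the property name and the class are hypothetical and are not taken from the patch or the dtest.

```java
// Hypothetical helper; not part of the Cassandra code base.
class CompletionDelaySketch {

    // Assumed property name, used only for this illustration.
    static final String DELAY_PROPERTY = "example.stream.completion_delay_ms";

    /** Sleeps for the configured delay (if any) and returns the delay in milliseconds. */
    static long maybeDelay() {
        long delayMillis = Long.getLong(DELAY_PROPERTY, 0L);
        if (delayMillis > 0) {
            try {
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag so callers can react to it.
                Thread.currentThread().interrupt();
            }
        }
        return delayMillis;
    }

    public static void main(String[] args) {
        System.out.println("delayed for " + maybeDelay() + " ms");
    }
}
```

Starting the JVM with {{-Dexample.stream.completion_delay_ms=<value>}} set above the socket timeout would then hold the completion step long enough for the timeout to fire.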
This problem will probably happen more often on 3.0 because of MVs, since
they're rebuilt by the receiving node at the end of the stream session.
I think we should keep finishing the stream session only after the secondary
indexes/MVs are rebuilt, to avoid leaving the node in an inconsistent state in
case the rebuild fails after the session is completed.
The proposed solution is to introduce a {{KeepAlive}} message and, after
reaching the {{WAIT_COMPLETE}} state, send it to the peer every
{{streaming_socket_timeout/2}}, so the socket stays active and does not throw a
{{SocketTimeoutException}} that would fail the stream session.
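The keep-alive scheduling can be sketched as below. This is an illustration under assumptions, not the patch itself: the class is hypothetical, the actual sending of a {{KeepAlive}} message is abstracted as a {{Runnable}}, and a {{ScheduledExecutorService}} stands in for whatever scheduling mechanism the real code uses.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; names do not come from the Cassandra code base.
class KeepAliveSketch implements AutoCloseable {
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor();
    private final long periodMillis;

    KeepAliveSketch(long socketTimeoutMillis) {
        this.periodMillis = keepAlivePeriodMillis(socketTimeoutMillis);
    }

    /** Keep-alive period: half the socket timeout, as proposed above. */
    static long keepAlivePeriodMillis(long socketTimeoutMillis) {
        if (socketTimeoutMillis < 2) {
            throw new IllegalArgumentException("timeout too small: " + socketTimeoutMillis);
        }
        return socketTimeoutMillis / 2;
    }

    /** Starts sending keep-alives; would be called on entering the waiting state. */
    void start(Runnable sendKeepAlive) {
        scheduler.scheduleAtFixedRate(
            sendKeepAlive, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    }

    /** Stops the keep-alive task, e.g. when the session completes or fails. */
    @Override
    public void close() {
        scheduler.shutdownNow();
    }

    /** Returns true if the expected number of keep-alives fired within 5 seconds. */
    static boolean demo(long socketTimeoutMillis, int expectedMessages) {
        CountDownLatch sent = new CountDownLatch(expectedMessages);
        try (KeepAliveSketch keepAlive = new KeepAliveSketch(socketTimeoutMillis)) {
            keepAlive.start(sent::countDown); // stand-in for writing a KeepAlive message
            return sent.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("keep-alives sent: " + demo(200, 3));
    }
}
```

Sending at half the timeout leaves room for one delayed or lost message before the peer's read would time out.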
I initially created a fix for 2.1 (even though it's near EOL, I think
{{streaming_socket_timeout}} not working is critical enough to be fixed on
2.1), and after review I will create patches for the other versions.
||2.1||dtest||
|[branch|https://github.com/apache/cassandra/compare/cassandra-2.1...pauloricardomg:2.1-8343]|[branch|https://github.com/riptano/cassandra-dtest/compare/master...pauloricardomg:8343]|
|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.1-8343-testall/lastCompletedBuild/testReport/]|
|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.1-8343-dtest/lastCompletedBuild/testReport/]|
> Secondary index creation causes moves/bootstraps to fail
> --------------------------------------------------------
>
> Key: CASSANDRA-8343
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8343
> Project: Cassandra
> Issue Type: Bug
> Reporter: Michael Frisch
> Assignee: Paulo Motta
>
> Node moves/bootstraps are failing if the stream timeout is set to a value in
> which secondary index creation cannot complete. This happens because at the
> end of the very last stream the StreamInSession.closeIfFinished() function
> calls maybeBuildSecondaryIndexes on every column family. If the stream time
> + all CF's index creation takes longer than your stream timeout then the
> socket closes from the sender's side, the receiver of the stream tries to
> write to said socket because it's not null, an IOException is thrown but not
> caught in closeIfFinished(), the exception is caught somewhere and not
> logged, AbstractStreamSession.close() is never called, and the CountDownLatch
> is never decremented. This causes the move/bootstrap to continue forever
> until the node is restarted.
> This problem of stream time + secondary index creation time exists on
> decommissioning/unbootstrap as well but since it's on the sending side the
> timeout triggers the onFailure() callback which does decrement the
> CountDownLatch leading to completion.
> A cursory glance at the 2.0 code leads me to believe this problem would exist
> there as well.
> Temporary workaround: set a really high/infinite stream timeout.