[ 
https://issues.apache.org/jira/browse/CASSANDRA-8343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15144718#comment-15144718
 ] 

Paulo Motta commented on CASSANDRA-8343:
----------------------------------------

Surprisingly, I could not reproduce this issue in 2.1 at first, because the 
{{streaming_socket_timeout}} parameter was not being enforced due to the use of 
a {{ReadableByteChannel}} created via {{socket.getChannel()}}, which never 
times out on reads (see [this 
article|https://technfun.wordpress.com/2009/01/29/networking-in-java-non-blocking-nio-blocking-nio-and-io/]
 for background). The workaround is to create the {{ReadableByteChannel}} via 
{{Channels.newChannel(socket.getInputStream())}} instead, so the socket 
{{SO_TIMEOUT}} is respected.
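
A minimal sketch of the channel construction described above, not the actual patch: the class and method names are hypothetical, and the loopback server exists only to provide a peer that never writes. It sets {{SO_TIMEOUT}} on the socket, wraps its input stream with {{Channels.newChannel}}, and checks that a blocking read fails with {{SocketTimeoutException}} instead of hanging.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

// Hypothetical demo class; not part of Cassandra.
final class TimeoutAwareChannelDemo {

    /** Returns true if reading from a silent peer fails with SocketTimeoutException. */
    static boolean readTimesOut(int timeoutMillis) {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            // Bind to an ephemeral loopback port; the accepted peer never writes.
            server.bind(new InetSocketAddress("127.0.0.1", 0));
            InetSocketAddress address = (InetSocketAddress) server.getLocalAddress();

            try (SocketChannel client = SocketChannel.open(address);
                 SocketChannel silentPeer = server.accept()) {
                Socket socket = client.socket();
                socket.setSoTimeout(timeoutMillis);

                // Wrap the input stream rather than using socket.getChannel(),
                // so that the read below is subject to SO_TIMEOUT.
                ReadableByteChannel in = Channels.newChannel(socket.getInputStream());
                try {
                    in.read(ByteBuffer.allocate(16));
                    return false; // data or EOF arrived; not expected here
                } catch (SocketTimeoutException e) {
                    return true;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```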

Even after this fix, the socket {{SO_TIMEOUT}} was never set on the 
receiving side, so I now also set it when the socket is attached there.

With both fixes in place, I managed to reproduce this issue on a [bootstrap 
dtest|https://github.com/pauloricardomg/cassandra-dtest/commit/301e332758b3873d2bb61259343375107caf437b]
 by introducing a sleep delay (via a system property) in the 
{{OnCompletionRunnable}} that is longer than {{streaming_socket_timeout}}.

This problem will probably happen more often on 3.0 because of MVs, since 
they're rebuilt by the receiving node at the end of the stream session.
I think we should continue to finish the stream session only after the secondary 
indexes/MVs are rebuilt, to avoid leaving the node in an inconsistent state in 
case the rebuild fails after the session is completed.

The proposed solution is to introduce a {{KeepAlive}} message and, once the 
session reaches the {{WAIT_COMPLETE}} state, send it to the peer every 
{{streaming_socket_timeout/2}}, so that the socket stays active and does 
not throw a {{SocketTimeoutException}} that fails the stream session.
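
The scheduling part of this idea can be sketched as follows. This is an illustration under assumptions, not the patch itself: {{KeepAliveSketch}}, {{keepAlivePeriodMillis}} and the {{sendKeepAlive}} callback are hypothetical names, and the callback stands in for whatever actually writes the {{KeepAlive}} message to the peer.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical helper; not part of Cassandra.
final class KeepAliveSketch implements AutoCloseable {

    // Daemon thread so a forgotten scheduler never blocks JVM shutdown.
    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "stream-keep-alive");
                t.setDaemon(true);
                return t;
            });

    /** Half the socket timeout, with a floor of 1 ms to keep the period valid. */
    static long keepAlivePeriodMillis(long socketTimeoutMillis) {
        return Math.max(1L, socketTimeoutMillis / 2);
    }

    /**
     * Starts invoking sendKeepAlive every socketTimeoutMillis / 2. Intended to be
     * called once the session is waiting for completion. If the callback throws,
     * scheduleAtFixedRate suppresses all later runs.
     */
    ScheduledFuture<?> start(long socketTimeoutMillis, Runnable sendKeepAlive) {
        long period = keepAlivePeriodMillis(socketTimeoutMillis);
        return executor.scheduleAtFixedRate(sendKeepAlive, period, period, TimeUnit.MILLISECONDS);
    }

    /** Stops sending keep-alives, e.g. when the session completes or fails. */
    @Override
    public void close() {
        executor.shutdownNow();
    }
}
```

Sending at half the timeout leaves room for one delayed or lost keep-alive before the peer's read would time out.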

I initially created a fix for 2.1 (even though it's near EOL, I think 
{{streaming_socket_timeout}} not working is critical enough to be fixed on 
2.1), and after review I will create patches for the other versions.

||2.1||dtest||
|[branch|https://github.com/apache/cassandra/compare/cassandra-2.1...pauloricardomg:2.1-8343]|[branch|https://github.com/riptano/cassandra-dtest/compare/master...pauloricardomg:8343]|
|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.1-8343-testall/lastCompletedBuild/testReport/]|
|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.1-8343-dtest/lastCompletedBuild/testReport/]|

> Secondary index creation causes moves/bootstraps to fail
> --------------------------------------------------------
>
>                 Key: CASSANDRA-8343
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8343
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Michael Frisch
>            Assignee: Paulo Motta
>
> Node moves/bootstraps are failing if the stream timeout is set to a value in 
> which secondary index creation cannot complete.  This happens because at the 
> end of the very last stream the StreamInSession.closeIfFinished() function 
> calls maybeBuildSecondaryIndexes on every column family.  If the stream time 
> + all CF's index creation takes longer than your stream timeout then the 
> socket closes from the sender's side, the receiver of the stream tries to 
> write to said socket because it's not null, an IOException is thrown but not 
> caught in closeIfFinished(), the exception is caught somewhere and not 
> logged, AbstractStreamSession.close() is never called, and the CountDownLatch 
> is never decremented.  This causes the move/bootstrap to continue forever 
> until the node is restarted.
> This problem of stream time + secondary index creation time exists on 
> decommissioning/unbootstrap as well but since it's on the sending side the 
> timeout triggers the onFailure() callback which does decrement the 
> CountDownLatch leading to completion.
> A cursory glance at the 2.0 code leads me to believe this problem would exist 
> there as well.
> Temporary workaround: set a really high/infinite stream timeout.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
