Michael Frisch created CASSANDRA-8343:
-----------------------------------------
Summary: Secondary index creation causes moves/bootstraps to fail
Key: CASSANDRA-8343
URL: https://issues.apache.org/jira/browse/CASSANDRA-8343
Project: Cassandra
Issue Type: Bug
Reporter: Michael Frisch
Node moves/bootstraps fail when the stream timeout is shorter than the time
secondary index creation needs to complete. At the end of the very last
stream, StreamInSession.closeIfFinished() calls maybeBuildSecondaryIndexes on
every column family. If the stream time plus the index creation time for all
CFs exceeds the stream timeout, the sender closes the socket. The receiver
then tries to write to that socket because its reference is not null, and an
IOException is thrown that closeIfFinished() does not catch. The exception is
caught somewhere else and never logged, so AbstractStreamSession.close() is
never called and the CountDownLatch is never decremented. The move/bootstrap
then hangs until the node is restarted.
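The hang described above can be reproduced in isolation. The following is a
minimal sketch, not Cassandra's actual code: the Session class, sendReply(),
and completes() are hypothetical stand-ins, and the try/finally variant is
only one possible way to guarantee the latch is released, not necessarily the
fix the project would choose.

```java
import java.io.IOException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class LatchHangSketch {
    /** Hypothetical stand-in for a receiving stream session (not Cassandra's class). */
    static class Session {
        private final CountDownLatch latch;
        private final boolean closeInFinally;

        Session(CountDownLatch latch, boolean closeInFinally) {
            this.latch = latch;
            this.closeInFinally = closeInFinally;
        }

        // Stand-in for writing to a socket that the sender has already closed.
        private void sendReply() throws IOException {
            throw new IOException("socket closed by sender");
        }

        // Stand-in for the session close: the only place the latch is decremented.
        void close() {
            latch.countDown();
        }

        void closeIfFinished() throws IOException {
            if (closeInFinally) {
                try {
                    sendReply();
                } finally {
                    close(); // runs even when sendReply() throws
                }
            } else {
                sendReply();
                close(); // skipped when sendReply() throws
            }
        }
    }

    /** Returns true if the waiter is released within a short timeout. */
    static boolean completes(boolean closeInFinally) throws InterruptedException {
        CountDownLatch latch = new CountDownLatch(1);
        Session session = new Session(latch, closeInFinally);
        try {
            session.closeIfFinished();
        } catch (IOException e) {
            // Swallowed without logging, mirroring the behaviour in the report.
        }
        // The real operation waits indefinitely; a timeout keeps the demo finite.
        return latch.await(100, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("close() after write: completed=" + completes(false));
        System.out.println("close() in finally:  completed=" + completes(true));
    }
}
```

In the first variant the waiter is never released, which corresponds to the
move/bootstrap that never finishes. In the second, close() runs regardless of
the write failure, so the waiter is released.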
The same problem (stream time plus secondary index creation time exceeding
the timeout) also exists for decommissioning/unbootstrap. There, however, it
occurs on the sending side, where the timeout triggers the onFailure()
callback, which does decrement the CountDownLatch, so the operation completes.
A cursory glance at the 2.0 code leads me to believe this problem would exist
there as well.
Temporary workaround: set a really high/infinite stream timeout.