Michael Frisch created CASSANDRA-8343:
-----------------------------------------

             Summary: Secondary index creation causes moves/bootstraps to fail
                 Key: CASSANDRA-8343
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8343
             Project: Cassandra
          Issue Type: Bug
            Reporter: Michael Frisch


Node moves/bootstraps fail when the stream timeout is shorter than the time 
secondary index creation needs to complete.  This happens because, at the end 
of the very last stream, StreamInSession.closeIfFinished() calls 
maybeBuildSecondaryIndexes on every column family.  If the stream time plus 
index creation for all CFs takes longer than the stream timeout, the socket is 
closed from the sender's side.  The receiver of the stream then tries to write 
to that socket because it is not null, and an IOException is thrown that 
closeIfFinished() does not catch.  The exception is caught somewhere else and 
not logged, so AbstractStreamSession.close() is never called and the 
CountDownLatch is never decremented.  As a result the move/bootstrap hangs 
until the node is restarted.
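
The hang described above can be illustrated with a minimal, self-contained 
sketch.  This is not Cassandra code: the class and method names 
(StreamLatchSketch, sendReply, finishUnguarded, finishGuarded, 
sessionCompletes) are hypothetical stand-ins, and the try/finally variant is 
only one possible way to guarantee the latch is released, not a proposed 
patch.

```java
import java.io.IOException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

final class StreamLatchSketch {
    /** Hypothetical stand-in for the receiver writing its final reply to the sender. */
    static void sendReply(boolean socketClosedBySender) throws IOException {
        if (socketClosedBySender) {
            throw new IOException("socket closed by sender after stream timeout");
        }
    }

    /** Unguarded shape: if the write throws, countDown() is skipped. */
    static void finishUnguarded(CountDownLatch latch, boolean socketClosed) throws IOException {
        sendReply(socketClosed);
        latch.countDown();
    }

    /** Guarded shape: the latch is released whether or not the write succeeds. */
    static void finishGuarded(CountDownLatch latch, boolean socketClosed) throws IOException {
        try {
            sendReply(socketClosed);
        } finally {
            latch.countDown();
        }
    }

    /**
     * Runs the final step on a worker thread that swallows the IOException
     * (mirroring the unlogged exception in the report) and returns whether
     * the waiting side is released within waitMillis.
     */
    static boolean sessionCompletes(boolean guarded, boolean socketClosed, long waitMillis) {
        CountDownLatch latch = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            try {
                if (guarded) {
                    finishGuarded(latch, socketClosed);
                } else {
                    finishUnguarded(latch, socketClosed);
                }
            } catch (IOException e) {
                // Intentionally swallowed and not logged, as in the reported behaviour.
            }
        });
        worker.start();
        try {
            worker.join();
            return latch.await(waitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

In this sketch, sessionCompletes(false, true, 100) returns false because the 
latch is never released, while sessionCompletes(true, true, 100) returns true.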

The same problem of stream time + secondary index creation time exists on 
decommissioning/unbootstrap as well, but because that node is on the sending 
side, the timeout triggers the onFailure() callback, which does decrement the 
CountDownLatch, so the operation completes.
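
For contrast, the sending-side behaviour can be sketched as a callback that 
releases the latch on failure.  Again this is an illustration under 
assumptions, not Cassandra's code: the StreamCallback interface, its 
onSuccess() method, and the helper names are hypothetical; only the existence 
of an onFailure() callback comes from the description above.

```java
import java.util.concurrent.CountDownLatch;

final class SenderCallbackSketch {
    /** Hypothetical callback type; only onFailure() is named in the report. */
    interface StreamCallback {
        void onSuccess();
        void onFailure();
    }

    /** Builds a callback that releases the latch on either outcome. */
    static StreamCallback latchReleasing(final CountDownLatch latch) {
        return new StreamCallback() {
            @Override
            public void onSuccess() {
                latch.countDown();
            }

            @Override
            public void onFailure() {
                latch.countDown();
            }
        };
    }

    /** Simulates a sender whose stream either times out or finishes normally. */
    static boolean senderCompletes(boolean timedOut) {
        CountDownLatch latch = new CountDownLatch(1);
        StreamCallback callback = latchReleasing(latch);
        if (timedOut) {
            callback.onFailure();
        } else {
            callback.onSuccess();
        }
        // A count of zero means a thread waiting on the latch would be released.
        return latch.getCount() == 0;
    }
}
```

Because the failure path also counts down, senderCompletes(true) returns true 
in this sketch, which is why the sending side does not hang.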

A cursory glance at the 2.0 code leads me to believe this problem would exist 
there as well.

Temporary workaround: set a very high or infinite stream timeout.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
