Tyler Hobbs created CASSANDRA-11891:
---------------------------------------

             Summary: WriteTimeout during commit log replay due to MV lock
                 Key: CASSANDRA-11891
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11891
             Project: Cassandra
          Issue Type: Bug
          Components: Lifecycle, Local Write-Read Paths
            Reporter: Tyler Hobbs
            Priority: Critical
             Fix For: 3.0.x, 3.x


During commit log replay, if there are materialized views, it's possible for 
contention on the MV lock to cause a {{WriteTimeoutException}}.  This makes 
commit log replay fail, which of course prevents the node from starting up.  
This generally means that the operator has to move the commitlog segments to 
avoid replay.

Here's a stacktrace of this happening on 3.0.5:

{noformat}
ERROR [main] 2016-05-25 15:10:31,120 CassandraDaemon.java:692 - Exception 
encountered during startup
java.lang.RuntimeException: java.util.concurrent.ExecutionException: 
org.apache.cassandra.exceptions.WriteTimeoutException: Operation timed out - 
received only 0 responses.
        at org.apache.cassandra.utils.Throwables.maybeFail(Throwables.java:50) 
~[apache-cassandra-3.0.5.jar:3.0.5]
        at 
org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:372) 
~[apache-cassandra-3.0.5.jar:3.0.5]
        at 
org.apache.cassandra.db.commitlog.CommitLogReplayer.replayMutation(CommitLogReplayer.java:624)
 ~[apache-cassandra-3.0.5.jar:3.0.5]
        at 
org.apache.cassandra.db.commitlog.CommitLogReplayer.replaySyncSection(CommitLogReplayer.java:511)
 ~[apache-cassandra-3.0.5.jar:3.0.5]
        at 
org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:406)
 ~[apache-cassandra-3.0.5.jar:3.0.5]
        at 
org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:153)
 ~[apache-cassandra-3.0.5.jar:3.0.5]
        at 
org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:189) 
~[apache-cassandra-3.0.5.jar:3.0.5]
        at 
org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:169) 
~[apache-cassandra-3.0.5.jar:3.0.5]
        at 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:283) 
[apache-cassandra-3.0.5.jar:3.0.5]
        at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:551) 
[apache-cassandra-3.0.5.jar:3.0.5]
        at 
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:679) 
[apache-cassandra-3.0.5.jar:3.0.5]
Caused by: java.util.concurrent.ExecutionException: 
org.apache.cassandra.exceptions.WriteTimeoutException: Operation timed out - 
received only 0 responses.
        at 
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.get(AbstractLocalAwareExecutorService.java:200)
 ~[apache-cassandra-3.0.5.jar:3.0.5]
        at 
org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:365) 
~[apache-cassandra-3.0.5.jar:3.0.5]
        ... 9 common frames omitted
        Suppressed: java.util.concurrent.ExecutionException: 
org.apache.cassandra.exceptions.WriteTimeoutException: Operation timed out - 
received only 0 responses.
                ... 11 common frames omitted
        Caused by: org.apache.cassandra.exceptions.WriteTimeoutException: 
Operation timed out - received only 0 responses.
                at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:431)
                at 
org.apache.cassandra.db.Keyspace.lambda$apply$62(Keyspace.java:443)
                at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
                at 
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
                at 
org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105)
                at java.lang.Thread.run(Thread.java:745)
{noformat}

We should ignore the {{write_rpc_timeout}} setting while acquiring MV locks if 
we're on the commitlog replay path.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to