Tyler Hobbs created CASSANDRA-11891:
---------------------------------------
Summary: WriteTimeout during commit log replay due to MV lock
Key: CASSANDRA-11891
URL: https://issues.apache.org/jira/browse/CASSANDRA-11891
Project: Cassandra
Issue Type: Bug
Components: Lifecycle, Local Write-Read Paths
Reporter: Tyler Hobbs
Priority: Critical
Fix For: 3.0.x, 3.x
During commit log replay, if there are materialized views, contention on the
MV lock can cause a {{WriteTimeoutException}}. This makes commit log replay
fail, which prevents the node from starting up. In practice, the operator then
has to move the commitlog segments out of the way to avoid replay.
Here's a stacktrace of this happening on 3.0.5:
{noformat}
ERROR [main] 2016-05-25 15:10:31,120 CassandraDaemon.java:692 - Exception encountered during startup
java.lang.RuntimeException: java.util.concurrent.ExecutionException: org.apache.cassandra.exceptions.WriteTimeoutException: Operation timed out - received only 0 responses.
    at org.apache.cassandra.utils.Throwables.maybeFail(Throwables.java:50) ~[apache-cassandra-3.0.5.jar:3.0.5]
    at org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:372) ~[apache-cassandra-3.0.5.jar:3.0.5]
    at org.apache.cassandra.db.commitlog.CommitLogReplayer.replayMutation(CommitLogReplayer.java:624) ~[apache-cassandra-3.0.5.jar:3.0.5]
    at org.apache.cassandra.db.commitlog.CommitLogReplayer.replaySyncSection(CommitLogReplayer.java:511) ~[apache-cassandra-3.0.5.jar:3.0.5]
    at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:406) ~[apache-cassandra-3.0.5.jar:3.0.5]
    at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:153) ~[apache-cassandra-3.0.5.jar:3.0.5]
    at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:189) ~[apache-cassandra-3.0.5.jar:3.0.5]
    at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:169) ~[apache-cassandra-3.0.5.jar:3.0.5]
    at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:283) [apache-cassandra-3.0.5.jar:3.0.5]
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:551) [apache-cassandra-3.0.5.jar:3.0.5]
    at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:679) [apache-cassandra-3.0.5.jar:3.0.5]
Caused by: java.util.concurrent.ExecutionException: org.apache.cassandra.exceptions.WriteTimeoutException: Operation timed out - received only 0 responses.
    at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.get(AbstractLocalAwareExecutorService.java:200) ~[apache-cassandra-3.0.5.jar:3.0.5]
    at org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:365) ~[apache-cassandra-3.0.5.jar:3.0.5]
    ... 9 common frames omitted
Suppressed: java.util.concurrent.ExecutionException: org.apache.cassandra.exceptions.WriteTimeoutException: Operation timed out - received only 0 responses.
    ... 11 common frames omitted
Caused by: org.apache.cassandra.exceptions.WriteTimeoutException: Operation timed out - received only 0 responses.
    at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:431)
    at org.apache.cassandra.db.Keyspace.lambda$apply$62(Keyspace.java:443)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
    at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105)
    at java.lang.Thread.run(Thread.java:745)
{noformat}
We should ignore the {{write_rpc_timeout}} setting
({{write_request_timeout_in_ms}} in {{cassandra.yaml}}) while acquiring MV
locks if we're on the commitlog replay path.
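
To illustrate the proposal, here is a minimal, self-contained sketch of the idea. It is not the actual Cassandra code and not the eventual patch: {{MvLockSketch}}, {{acquire}}, {{contend}}, and {{LockTimeoutException}} are hypothetical names, a plain {{ReentrantLock}} stands in for the MV lock, and the {{isCommitLogReplay}} flag is an assumed way of telling the lock path that it is running under replay. Regular writes give up after the timeout; replay writes wait until the lock is free.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class MvLockSketch {
    /** Hypothetical stand-in for WriteTimeoutException. */
    static class LockTimeoutException extends RuntimeException {
        LockTimeoutException(String message) { super(message); }
    }

    /**
     * Acquires the lock. Regular writes honor the timeout; writes on the
     * commit log replay path (an assumed flag) wait without a deadline.
     */
    static void acquire(Lock lock, long timeoutMillis, boolean isCommitLogReplay) {
        try {
            if (isCommitLogReplay) {
                // Replay must not fail on contention: block until the lock is free.
                lock.lockInterruptibly();
                return;
            }
            if (!lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS))
                throw new LockTimeoutException("timed out after " + timeoutMillis + " ms");
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
    }

    /**
     * Simulates contention: another thread holds the lock for holdMillis
     * while the caller tries to acquire it. Returns true if acquired.
     */
    static boolean contend(long holdMillis, long timeoutMillis, boolean isCommitLogReplay) {
        Lock lock = new ReentrantLock();
        CountDownLatch held = new CountDownLatch(1);
        Thread holder = new Thread(() -> {
            lock.lock();
            try {
                held.countDown();
                Thread.sleep(holdMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.unlock();
            }
        });
        holder.start();
        try {
            held.await(); // make sure the other thread owns the lock first
            acquire(lock, timeoutMillis, isCommitLogReplay);
            lock.unlock();
            return true;
        } catch (LockTimeoutException e) {
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("regular write acquired lock: " + contend(300, 50, false));
        System.out.println("replay write acquired lock:  " + contend(300, 50, true));
    }
}
```

With a 300 ms hold and a 50 ms timeout, the regular write is expected to give up while the replay write waits and succeeds. Whether replay should wait indefinitely or use a separate, longer bound is a design choice this sketch does not settle.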