[
https://issues.apache.org/jira/browse/CASSANDRA-10779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15064642#comment-15064642
]
Carl Yeksigian commented on CASSANDRA-10779:
--------------------------------------------
I think there is a different issue lurking here, which is that we are
responding to mutations early on the replicas if we don't acquire the lock
right away. We are returning early from the {{mutation.apply}} call because we
were unable to acquire the lock, but in the {{MutationVerbHandler}} we only
skip the reply when we get a WriteTimeoutException (WTE), which only happens
in the MV path.
The reason we changed this to put itself back on the queue when it was unable
to acquire the lock was that we would otherwise be blocking other mutations
that are waiting, especially the coordinator batchlog mutations. Since we
aren't doing the coordinator batchlog, it might be best to just switch back to
waiting for the lock synchronously. The other option would be to include a
callback function with the mutation to run on success/failure.
I'm currently testing the impact of switching back to acquiring the lock
synchronously; I'll also look into how much change is required to add
callback functions.
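To make the callback option concrete, here is a minimal, self-contained sketch. It is not Cassandra code: {{MutationCallbackSketch}}, {{apply}}, {{handle}}, {{drain}} and the attempt counter are hypothetical names made up for illustration, a {{Semaphore}} stands in for the lock, and a plain queue stands in for the mutation stage. The point it shows is that the reply is sent from the callback, so requeueing on a busy lock no longer produces an early reply.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.Semaphore;
import java.util.function.Consumer;

// Hypothetical sketch only: none of these names are Cassandra's real classes or methods.
class MutationCallbackSketch {
    enum Result { APPLIED, TIMED_OUT }

    // Stand-in for the lock that the mutation must hold while it is applied.
    final Semaphore partitionLock = new Semaphore(1);
    // Stand-in for the mutation stage queue.
    final Queue<Runnable> stage = new ArrayDeque<>();
    // Mutations that were actually written.
    final List<String> applied = new ArrayList<>();
    // Replies sent back to the coordinator.
    final List<String> replies = new ArrayList<>();

    // Try the lock once; if it is busy, requeue instead of blocking the worker.
    // The callback fires only when the mutation has a final outcome.
    // attemptsLeft is a simplification of a real time-based write timeout.
    void apply(String mutation, int attemptsLeft, Consumer<Result> onDone) {
        if (partitionLock.tryAcquire()) {
            try {
                applied.add(mutation);
            } finally {
                partitionLock.release();
            }
            onDone.accept(Result.APPLIED);
        } else if (attemptsLeft > 1) {
            stage.add(() -> apply(mutation, attemptsLeft - 1, onDone));
        } else {
            onDone.accept(Result.TIMED_OUT);
        }
    }

    // Handler: the reply is sent from the callback, never right after apply() returns.
    void handle(String mutation) {
        apply(mutation, 3, result -> replies.add(mutation + ":" + result));
    }

    // Run queued tasks until the stage is empty.
    void drain() {
        Runnable task;
        while ((task = stage.poll()) != null) {
            task.run();
        }
    }

    public static void main(String[] args) {
        MutationCallbackSketch replica = new MutationCallbackSketch();
        replica.partitionLock.acquireUninterruptibly(); // another writer holds the lock
        replica.handle("m1");
        System.out.println(replica.replies);            // nothing has been sent yet
        replica.partitionLock.release();
        replica.drain();                                // the requeued task now succeeds
        System.out.println(replica.replies);
    }
}
```

The synchronous alternative would replace {{tryAcquire()}} with a blocking acquire and drop the requeue branch, at the cost of tying up the worker while it waits.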
> AbstractTracingAwareExecutorService.java:169 - Uncaught exception on thread
> Thread
> ----------------------------------------------------------------------------------
>
> Key: CASSANDRA-10779
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10779
> Project: Cassandra
> Issue Type: Bug
> Components: Coordination
> Environment: Windows 7 64-bit, Cassandra v3.0.0, Java 1.8u60
> Reporter: Will Zhang
> Assignee: Carl Yeksigian
> Fix For: 3.0.x
>
>
> Hi guys,
> I encountered the following warning message while testing an upgrade
> from v2.2.2 to v3.0.0.
> It looks like a write timeout surfacing as an uncaught exception. Could this
> be an easy fix?
> Log file section below. Thank you!
> {code}
> WARN [SharedPool-Worker-64] 2015-11-26 14:04:24,678 AbstractTracingAwareExecutorService.java:169 - Uncaught exception on thread Thread[SharedPool-Worker-64,10,main]: {}
> org.apache.cassandra.exceptions.WriteTimeoutException: Operation timed out - received only 0 responses.
>     at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:427) ~[apache-cassandra-3.0.0.jar:3.0.0]
>     at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:386) ~[apache-cassandra-3.0.0.jar:3.0.0]
>     at org.apache.cassandra.db.Mutation.apply(Mutation.java:205) ~[apache-cassandra-3.0.0.jar:3.0.0]
>     at org.apache.cassandra.db.Keyspace.lambda$apply$59(Keyspace.java:435) ~[apache-cassandra-3.0.0.jar:3.0.0]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_60]
>     at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164) ~[apache-cassandra-3.0.0.jar:3.0.0]
>     at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [apache-cassandra-3.0.0.jar:3.0.0]
>     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60]
> INFO [IndexSummaryManager:1] 2015-11-26 14:41:10,527 IndexSummaryManager.java:257 - Redistributing index summaries
> {code}