[ https://issues.apache.org/jira/browse/CASSANDRA-4578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Ellis updated CASSANDRA-4578:
--------------------------------------
Priority: Minor (was: Major)
Affects Version/s: 0.8.0 (was: 1.1.3)
Fix Version/s: 1.1.5
Assignee: Sylvain Lebresne
You're right: since CounterMutationVerbHandler (CMVH) holds a writer thread
until it gets replies from the other replicas, two replicas can deadlock, with
A waiting for a reply from B and B waiting for a reply from A.
One fix would be to move the local write into CMVH and the remote part into a
separate stage (or maybe just a custom callback).
As a workaround, use CL.ONE with counters.
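
To make the failure mode concrete, here is a minimal standalone sketch. It is
not Cassandra code: the Replica class, the one-thread "mutation stage" pools,
and the method names are all hypothetical stand-ins. It contrasts a handler
that blocks its worker thread while waiting for the peer's reply with one that
releases the thread and completes from a callback.

```java
import java.util.concurrent.*;

public class CounterDeadlockSketch {

    // Hypothetical stand-in for a replica; its "mutation stage" has a single
    // worker thread so that the pool is trivially easy to exhaust.
    static final class Replica {
        final ExecutorService mutationStage = Executors.newFixedThreadPool(1);
        Replica peer;

        // The replicated write needs a mutation thread on this replica.
        Future<String> applyReplicated() {
            return mutationStage.submit(() -> "ack");
        }

        // Blocking style: the worker thread is held until the peer replies
        // or the timeout expires. Returns true only if a reply arrived.
        Future<Boolean> blockingWrite(CountDownLatch busy, long timeoutMs) {
            return mutationStage.submit(() -> {
                busy.countDown();
                busy.await(); // both replicas' workers are now occupied
                Future<String> reply = peer.applyReplicated();
                try {
                    reply.get(timeoutMs, TimeUnit.MILLISECONDS);
                    return true;
                } catch (TimeoutException e) {
                    return false;
                }
            });
        }

        // Callback style: hand the remote part to the peer, then return so
        // the worker thread is free to serve the peer's own request.
        CompletableFuture<Boolean> callbackWrite(CountDownLatch busy) {
            CompletableFuture<Boolean> done = new CompletableFuture<>();
            mutationStage.submit(() -> {
                busy.countDown();
                busy.await();
                CompletableFuture.supplyAsync(() -> "ack", peer.mutationStage)
                                 .thenAccept(ack -> done.complete(true));
                return null;
            });
            return done;
        }
    }

    static Replica[] pair() {
        Replica a = new Replica();
        Replica b = new Replica();
        a.peer = b;
        b.peer = a;
        return new Replica[] { a, b };
    }

    static void shutdown(Replica[] replicas) {
        for (Replica r : replicas) {
            r.mutationStage.shutdownNow();
        }
    }

    // True only if both blocking writes got their reply within the timeout.
    static boolean blockingCompletes(long timeoutMs) throws Exception {
        Replica[] r = pair();
        CountDownLatch busy = new CountDownLatch(2);
        try {
            Future<Boolean> fa = r[0].blockingWrite(busy, timeoutMs);
            Future<Boolean> fb = r[1].blockingWrite(busy, timeoutMs);
            boolean okA = fa.get();
            boolean okB = fb.get();
            return okA && okB;
        } finally {
            shutdown(r);
        }
    }

    // True if both callback-style writes complete within the timeout.
    static boolean callbackCompletes(long timeoutMs) throws Exception {
        Replica[] r = pair();
        CountDownLatch busy = new CountDownLatch(2);
        try {
            CompletableFuture<Boolean> fa = r[0].callbackWrite(busy);
            CompletableFuture<Boolean> fb = r[1].callbackWrite(busy);
            return fa.get(timeoutMs, TimeUnit.MILLISECONDS)
                && fb.get(timeoutMs, TimeUnit.MILLISECONDS);
        } finally {
            shutdown(r);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("blocking completes: " + blockingCompletes(200));
        System.out.println("callback completes: " + callbackCompletes(2000));
    }
}
```

In the blocking variant at least one of the two writes can only end by timing
out, because each reply is queued behind the very task that is waiting for it.
The callback variant frees the worker before the reply is needed, so both
writes finish. The real fix may of course look quite different.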
> Dead lock in mutation stage when many concurrent writes to few columns
> ----------------------------------------------------------------------
>
> Key: CASSANDRA-4578
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4578
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.8.0
> Environment: 15 cassandra instances
> CentOS5
> 8 Core 64GB Memory
> java version "1.6.0_33"
> Java(TM) SE Runtime Environment (build 1.6.0_33-b04)
> Java HotSpot(TM) 64-Bit Server VM (build 20.8-b03, mixed mode)
> Reporter: Suguru Namura
> Assignee: Sylvain Lebresne
> Priority: Minor
> Fix For: 1.1.5
>
> Attachments: threaddump-1344957574788.tdump
>
>
> When I send many requests to increment a few counter columns, the mutation
> stage sometimes deadlocks. When this happens, all of the mutation threads
> are blocked and do not accept updates any more.
> {noformat}
> "MutationStage:432" - Thread t@1389
> java.lang.Thread.State: TIMED_WAITING
> at java.lang.Object.wait(Native Method)
> - waiting on <b90b45b> (a org.apache.cassandra.utils.SimpleCondition)
> at java.lang.Object.wait(Object.java:443)
> at java.util.concurrent.TimeUnit.timedWait(TimeUnit.java:292)
> at org.apache.cassandra.utils.SimpleCondition.await(SimpleCondition.java:54)
> at org.apache.cassandra.service.AbstractWriteResponseHandler.get(AbstractWriteResponseHandler.java:55)
> at org.apache.cassandra.db.CounterMutationVerbHandler.doVerb(CounterMutationVerbHandler.java:51)
> at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> Locked ownable synchronizers:
> - locked <4b1b0a6f> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
> {noformat}