[jira] [Updated] (CASSANDRA-4578) Dead lock in mutation stage when many concurrent writes to few columns
[ https://issues.apache.org/jira/browse/CASSANDRA-4578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nate McCall updated CASSANDRA-4578:
----------------------------------
Attachment: 4578-1.0-backport.txt

Patch against the latest cassandra-1.0. It differs only in line numbers; otherwise no issues. All tests pass.

Dead lock in mutation stage when many concurrent writes to few columns
----------------------------------------------------------------------
Key: CASSANDRA-4578
URL: https://issues.apache.org/jira/browse/CASSANDRA-4578
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 0.8.0
Environment: 15 cassandra instances
    CentOS5
    8 Core 64GB Memory
    java version 1.6.0_33
    Java(TM) SE Runtime Environment (build 1.6.0_33-b04)
    Java HotSpot(TM) 64-Bit Server VM (build 20.8-b03, mixed mode)
Reporter: Suguru Namura
Assignee: Sylvain Lebresne
Priority: Minor
Fix For: 1.1.5
Attachments: 4578-1.0-backport.txt, 4578.txt, threaddump-1344957574788.tdump

When I send many requests to increment a few counter columns, the mutation stage sometimes deadlocks. When this happens, all of the mutation threads are blocked and stop accepting updates.

{noformat}
MutationStage:432 - Thread t@1389
   java.lang.Thread.State: TIMED_WAITING
    at java.lang.Object.wait(Native Method)
    - waiting on b90b45b (a org.apache.cassandra.utils.SimpleCondition)
    at java.lang.Object.wait(Object.java:443)
    at java.util.concurrent.TimeUnit.timedWait(TimeUnit.java:292)
    at org.apache.cassandra.utils.SimpleCondition.await(SimpleCondition.java:54)
    at org.apache.cassandra.service.AbstractWriteResponseHandler.get(AbstractWriteResponseHandler.java:55)
    at org.apache.cassandra.db.CounterMutationVerbHandler.doVerb(CounterMutationVerbHandler.java:51)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

   Locked ownable synchronizers:
    - locked 4b1b0a6f (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
{noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-4578) Dead lock in mutation stage when many concurrent writes to few columns
[ https://issues.apache.org/jira/browse/CASSANDRA-4578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-4578:
---------------------------------------
Fix Version/s: 1.1.5 (was: 1.1.6)

Dead lock in mutation stage when many concurrent writes to few columns
----------------------------------------------------------------------
Key: CASSANDRA-4578
URL: https://issues.apache.org/jira/browse/CASSANDRA-4578
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 0.8.0
Environment: 15 cassandra instances
    CentOS5
    8 Core 64GB Memory
    java version 1.6.0_33
    Java(TM) SE Runtime Environment (build 1.6.0_33-b04)
    Java HotSpot(TM) 64-Bit Server VM (build 20.8-b03, mixed mode)
Reporter: Suguru Namura
Assignee: Sylvain Lebresne
Priority: Minor
Fix For: 1.1.5
Attachments: 4578.txt, threaddump-1344957574788.tdump
[jira] [Updated] (CASSANDRA-4578) Dead lock in mutation stage when many concurrent writes to few columns
[ https://issues.apache.org/jira/browse/CASSANDRA-4578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-4578:
---------------------------------------
Attachment: 4578.txt

Attaching a patch that uses a callback to send the response back from CMVH (CounterMutationVerbHandler). A callback avoids creating lots of threads that just spend their time waiting on a condition.

Dead lock in mutation stage when many concurrent writes to few columns
----------------------------------------------------------------------
Key: CASSANDRA-4578
URL: https://issues.apache.org/jira/browse/CASSANDRA-4578
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 0.8.0
Environment: 15 cassandra instances
    CentOS5
    8 Core 64GB Memory
    java version 1.6.0_33
    Java(TM) SE Runtime Environment (build 1.6.0_33-b04)
    Java HotSpot(TM) 64-Bit Server VM (build 20.8-b03, mixed mode)
Reporter: Suguru Namura
Assignee: Sylvain Lebresne
Priority: Minor
Fix For: 1.1.6
Attachments: 4578.txt, threaddump-1344957574788.tdump
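
As a rough illustration of the callback idea in this update (not the attached patch, and not Cassandra's actual classes), the sketch below models two replicas whose mutation stages have one thread each. The names `CallbackHandlerSketch`, `callbackHandler`, `stageA` and `stageB` are hypothetical, and it uses Java 8+ `CompletableFuture` only for brevity. Each handler forwards the write to the other replica and registers a callback that sends the response, so the stage thread returns at once instead of waiting on a condition.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical model: each ExecutorService stands in for one replica's mutation stage.
class CallbackHandlerSketch {

    // Handler that forwards a write to `remote` and answers via callback, without blocking.
    static Runnable callbackHandler(ExecutorService remote,
                                    CountDownLatch bothRunning,
                                    CompletableFuture<Boolean> response) {
        return () -> {
            bothRunning.countDown();
            try {
                // Force the interleaving where both stages are busy at the same time.
                bothRunning.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                response.complete(false);
                return;
            }
            CompletableFuture
                .runAsync(() -> { /* the other replica applies the write */ }, remote)
                .thenRun(() -> response.complete(true)); // callback sends the response
            // The handler returns here, which frees the stage thread for queued work.
        };
    }

    // Returns {A responded, B responded}.
    static boolean[] run() throws Exception {
        ExecutorService stageA = Executors.newFixedThreadPool(1);
        ExecutorService stageB = Executors.newFixedThreadPool(1);
        CountDownLatch bothRunning = new CountDownLatch(2);
        CompletableFuture<Boolean> responseA = new CompletableFuture<>();
        CompletableFuture<Boolean> responseB = new CompletableFuture<>();
        try {
            stageA.execute(callbackHandler(stageB, bothRunning, responseA));
            stageB.execute(callbackHandler(stageA, bothRunning, responseB));
            return new boolean[] {
                responseA.get(2, TimeUnit.SECONDS),
                responseB.get(2, TimeUnit.SECONDS)
            };
        } finally {
            stageA.shutdown();
            stageB.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        boolean[] r = run();
        System.out.println("A responded: " + r[0] + ", B responded: " + r[1]);
    }
}
```

Because neither handler holds its stage thread while waiting, the write queued on each stage gets to run and both responses complete, even with a single thread per stage.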
[jira] [Updated] (CASSANDRA-4578) Dead lock in mutation stage when many concurrent writes to few columns
[ https://issues.apache.org/jira/browse/CASSANDRA-4578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-4578:
-------------------------------------
Priority: Minor (was: Major)
Affects Version/s: 0.8.0 (was: 1.1.3)
Fix Version/s: 1.1.5
Assignee: Sylvain Lebresne

You're right, since CMVH grabs a writer thread until it gets replies from the other replicas, you can have two replicas deadlock with A waiting for a reply from B, and B waiting for a reply from A.

One fix would be to move the local write into CMVH and the remote part into a separate stage (or maybe just a custom callback).

As a workaround, use CL.ONE with counters.

Dead lock in mutation stage when many concurrent writes to few columns
----------------------------------------------------------------------
Key: CASSANDRA-4578
URL: https://issues.apache.org/jira/browse/CASSANDRA-4578
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 0.8.0
Environment: 15 cassandra instances
    CentOS5
    8 Core 64GB Memory
    java version 1.6.0_33
    Java(TM) SE Runtime Environment (build 1.6.0_33-b04)
    Java HotSpot(TM) 64-Bit Server VM (build 20.8-b03, mixed mode)
Reporter: Suguru Namura
Assignee: Sylvain Lebresne
Priority: Minor
Fix For: 1.1.5
Attachments: threaddump-1344957574788.tdump
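
The A-waits-for-B, B-waits-for-A situation described in the comment above can be reproduced in miniature. The sketch below is only a model under stated assumptions: each replica's mutation stage is a one-thread `ExecutorService`, and the names `BlockingHandlerDeadlockSketch`, `blockingHandler`, `stageA` and `stageB` are hypothetical, not Cassandra code. Each handler blocks its stage thread while waiting for the other replica, whose only thread is blocked the same way. A 500 ms timeout is added so the example terminates.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical model: each ExecutorService stands in for one replica's mutation stage.
class BlockingHandlerDeadlockSketch {

    // Handler that forwards a write to `remote` and blocks its own stage thread for the reply.
    // Returns true if the reply arrived before the timeout.
    static Callable<Boolean> blockingHandler(ExecutorService remote, CountDownLatch bothRunning) {
        return () -> {
            bothRunning.countDown();
            bothRunning.await(); // both stages are now occupied by a handler
            Future<?> reply = remote.submit(() -> { /* the other replica applies the write */ });
            try {
                reply.get(500, TimeUnit.MILLISECONDS); // stage thread is held while waiting
                return true;
            } catch (TimeoutException e) {
                return false; // the remote write never ran: its stage thread was busy waiting too
            }
        };
    }

    // Returns {A got its reply, B got its reply}.
    static boolean[] run() throws Exception {
        ExecutorService stageA = Executors.newFixedThreadPool(1);
        ExecutorService stageB = Executors.newFixedThreadPool(1);
        CountDownLatch bothRunning = new CountDownLatch(2);
        try {
            Future<Boolean> a = stageA.submit(blockingHandler(stageB, bothRunning));
            Future<Boolean> b = stageB.submit(blockingHandler(stageA, bothRunning));
            return new boolean[] { a.get(), b.get() };
        } finally {
            stageA.shutdown();
            stageB.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        boolean[] r = run();
        System.out.println("A got reply: " + r[0] + ", B got reply: " + r[1]);
    }
}
```

Both handlers time out, because the write each one is waiting for sits in a queue behind the other blocked handler. With more threads per stage the same thing can happen once enough concurrent counter writes occupy every thread on both sides.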
[jira] [Updated] (CASSANDRA-4578) Dead lock in mutation stage when many concurrent writes to few columns
[ https://issues.apache.org/jira/browse/CASSANDRA-4578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suguru Namura updated CASSANDRA-4578:
------------------------------------
Attachment: threaddump-1344957574788.tdump

Attached thread dump

Dead lock in mutation stage when many concurrent writes to few columns
----------------------------------------------------------------------
Key: CASSANDRA-4578
URL: https://issues.apache.org/jira/browse/CASSANDRA-4578
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 1.1.3
Environment: 15 cassandra instances
    CentOS5
    8 Core 64GB Memory
    java version 1.6.0_33
    Java(TM) SE Runtime Environment (build 1.6.0_33-b04)
    Java HotSpot(TM) 64-Bit Server VM (build 20.8-b03, mixed mode)
Reporter: Suguru Namura
Attachments: threaddump-1344957574788.tdump