DynamicSnitch race in adding latencies
--------------------------------------
Key: CASSANDRA-2618
URL: https://issues.apache.org/jira/browse/CASSANDRA-2618
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 0.7.0
Reporter: Brandon Williams
Assignee: Brandon Williams
ERROR 15:33:48,614 Fatal exception in thread Thread[ReadStage:264,5,main]
java.lang.RuntimeException: java.util.NoSuchElementException
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:680)
Caused by: java.util.NoSuchElementException
at java.util.concurrent.LinkedBlockingDeque.removeFirst(LinkedBlockingDeque.java:401)
at java.util.concurrent.LinkedBlockingDeque.remove(LinkedBlockingDeque.java:621)
at org.apache.cassandra.locator.AdaptiveLatencyTracker.add(DynamicEndpointSnitch.java:288)
at org.apache.cassandra.locator.DynamicEndpointSnitch.receiveTiming(DynamicEndpointSnitch.java:202)
at org.apache.cassandra.net.MessagingService.addLatency(MessagingService.java:152)
at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:642)
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
... 3 more
ERROR 15:33:48,615 Fatal exception in thread Thread[ReadStage:264,5,main]
java.lang.RuntimeException: java.util.NoSuchElementException
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:680)
Caused by: java.util.NoSuchElementException
at java.util.concurrent.LinkedBlockingDeque.removeFirst(LinkedBlockingDeque.java:401)
at java.util.concurrent.LinkedBlockingDeque.remove(LinkedBlockingDeque.java:621)
at org.apache.cassandra.locator.AdaptiveLatencyTracker.add(DynamicEndpointSnitch.java:288)
at org.apache.cassandra.locator.DynamicEndpointSnitch.receiveTiming(DynamicEndpointSnitch.java:202)
at org.apache.cassandra.net.MessagingService.addLatency(MessagingService.java:152)
at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:642)
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
... 3 more
What is happening is that AdaptiveLatencyTracker.add tries to add a latency,
finds the deque full, and so falls back to removing an entry from the deque
and retrying the add. However, by the time it calls remove(), the deque has
already been emptied by DES.reset, which calls clear() on all the ALTs, so
remove() throws NoSuchElementException. This bug has existed for a long time,
but it's very rare and difficult to trigger.
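
To make the interleaving concrete, here is a minimal sketch of the
offer/remove/offer pattern described above. It is not the Cassandra source:
the class LatencyWindowSketch, its methods, and the "interleave" hook (which
stands in for a concurrent reset clearing the deque) are all hypothetical.
The only real API used is java.util.concurrent.LinkedBlockingDeque, where
remove() throws NoSuchElementException on an empty deque and poll() returns
null instead. Using poll() is one possible way to tolerate the race, not
necessarily the fix applied for this issue.

```java
import java.util.NoSuchElementException;
import java.util.concurrent.LinkedBlockingDeque;

// Hypothetical stand-in for a bounded per-endpoint latency window.
class LatencyWindowSketch {
    private final LinkedBlockingDeque<Double> window;

    LatencyWindowSketch(int capacity) {
        window = new LinkedBlockingDeque<>(capacity);
    }

    // Racy pattern: if another thread empties the deque after offer()
    // reported it full, remove() throws NoSuchElementException.
    // The interleave hook simulates that other thread deterministically.
    void addRacy(double latency, Runnable interleave) {
        if (!window.offer(latency)) {
            interleave.run();
            window.remove();       // throws if the deque is now empty
            window.offer(latency);
        }
    }

    // Tolerant pattern (an assumption, not the confirmed fix): poll()
    // returns null on an empty deque, so a concurrent clear() is harmless.
    void addTolerant(double latency, Runnable interleave) {
        if (!window.offer(latency)) {
            interleave.run();
            window.poll();         // null if the deque is now empty
            window.offer(latency);
        }
    }

    void clear() { window.clear(); }

    int size() { return window.size(); }

    public static void main(String[] args) {
        LatencyWindowSketch racy = new LatencyWindowSketch(1);
        racy.addRacy(1.0, () -> {});
        try {
            racy.addRacy(2.0, racy::clear); // clear lands between offer and remove
        } catch (NoSuchElementException e) {
            System.out.println("racy add failed: " + e);
        }

        LatencyWindowSketch tolerant = new LatencyWindowSketch(1);
        tolerant.addTolerant(1.0, () -> {});
        tolerant.addTolerant(2.0, tolerant::clear);
        System.out.println("tolerant size: " + tolerant.size());
    }
}
```

In the sketch the clear is forced to land between the failed offer() and
the remove(); in a running node that window is tiny, which is consistent
with the bug being rare and hard to trigger.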