[ https://issues.apache.org/jira/browse/STORM-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984398#comment-13984398 ]
Robert Joseph Evans commented on STORM-299:
-------------------------------------------
This seems to indicate that a tick tuple was trying to publish to the receive
queue of an executor but could not because the queue was full, while at the
same time an executor, possibly a different one, was waiting for data to
become available on its queue. I'm not sure that there is any way to tie the
two together. In fact, if the two are for the same queue, it probably means
that you have found a bug in the Disruptor. If this is reproducible, we should
be able to get to the bottom of it, as long as you are OK with sharing stack
traces and possibly a heap dump of a worker experiencing this issue. If you
are not able to share those, it will be much more difficult to track down
what is happening.
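
To make the distinction between the two states concrete, here is a minimal
sketch. It is only an illustration: it uses java.util.concurrent's
ArrayBlockingQueue as a stand-in for an executor's receive queue, not the
Disruptor ring buffer that Storm actually uses, and the class and method names
(QueueStateSketch, tryPublish, tryConsume, looksFullAndEmpty) are made up for
this example. It shows a publisher timing out on a full queue while a consumer
times out on a different, empty queue, and a check for the one combination
that should not be possible on a single bounded queue.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only: ArrayBlockingQueue stands in for an
// executor's receive queue. It is not the Disruptor ring buffer.
public class QueueStateSketch {

    // Try to publish, giving up after timeoutMs if the queue stays full.
    static boolean tryPublish(BlockingQueue<String> queue, String item, long timeoutMs) {
        try {
            return queue.offer(item, timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Try to consume, giving up after timeoutMs if the queue stays empty.
    static String tryConsume(BlockingQueue<String> queue, long timeoutMs) {
        try {
            return queue.poll(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    // A bounded queue with capacity >= 1 should never report both states at once.
    static boolean looksFullAndEmpty(BlockingQueue<?> queue) {
        return queue.remainingCapacity() == 0 && queue.isEmpty();
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queueA = new ArrayBlockingQueue<>(2); // will be full
        BlockingQueue<String> queueB = new ArrayBlockingQueue<>(2); // stays empty
        queueA.put("tuple-1");
        queueA.put("tuple-2");

        // The publisher is stuck on A while the consumer is starved on B.
        System.out.println("publish to A succeeded: " + tryPublish(queueA, "tick", 50));
        System.out.println("consumed from B: " + tryConsume(queueB, 50));
        System.out.println("A full and empty: " + looksFullAndEmpty(queueA));
    }
}
```

With two separate queues, both waits are legitimate and unrelated. Only a
single queue whose publisher sees it as full while its consumer sees it as
empty would point at a bug in the queue itself.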
> tick tuples stop in a long running topology
> -------------------------------------------
>
> Key: STORM-299
> URL: https://issues.apache.org/jira/browse/STORM-299
> Project: Apache Storm (Incubating)
> Issue Type: Bug
> Affects Versions: 0.9.0.1
> Reporter: Sushant Kumar
>
> We are running a topology that processes about 10K tuples a second on a
> single worker. There are about 80 executor threads across 3 different bolt
> types, each receiving a tick tuple once per second. We observe that the
> tick tuples stop coming in after a few hours, or sometimes only after a
> day. The nimbus UI also shows that the tick tuple count has frozen, whereas
> the same set of bolts continues to process tuples from other spouts. I have
> taken a stack trace and it shows the following:
> "Thread-8" daemon prio=10 tid=0x00007f1a0fed2000 nid=0x4438 runnable
> [0x00007f19a5e01000]
> java.lang.Thread.State: TIMED_WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:349)
> at com.lmax.disruptor.AbstractMultithreadedClaimStrategy.waitForFreeSlotAt(AbstractMultithreadedClaimStrategy.java:99)
> at com.lmax.disruptor.AbstractMultithreadedClaimStrategy.incrementAndGet(AbstractMultithreadedClaimStrategy.java:49)
> at com.lmax.disruptor.Sequencer.next(Sequencer.java:127)
> at backtype.storm.utils.DisruptorQueue.publish(DisruptorQueue.java:113)
> at backtype.storm.disruptor$publish.invoke(disruptor.clj:51)
> at backtype.storm.disruptor$publish.invoke(disruptor.clj:53)
> at backtype.storm.daemon.executor$setup_ticks_BANG_$fn__3885.invoke(executor.clj:299)
> at backtype.storm.timer$schedule_recurring$this__1839.invoke(timer.clj:79)
> at backtype.storm.timer$mk_timer$thread_fn__1822$fn__1823.invoke(timer.clj:32)
> at backtype.storm.timer$mk_timer$thread_fn__1822.invoke(timer.clj:25)
> at clojure.lang.AFn.run(AFn.java:24)
> at java.lang.Thread.run(Thread.java:744)
>
> Locked ownable synchronizers:
> - None
>
> "Thread-268-__system" prio=10 tid=0x00007f1a0ef45000 nid=0x46b5 runnable
> [0x00007f17ef574000]
> java.lang.Thread.State: TIMED_WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x000000073d9a2060> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2176)
> at com.lmax.disruptor.BlockingWaitStrategy.waitFor(BlockingWaitStrategy.java:87)
> at com.lmax.disruptor.ProcessingSequenceBarrier.waitFor(ProcessingSequenceBarrier.java:54)
> at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:56)
> at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:62)
> at backtype.storm.daemon.executor$eval4023$fn__4024$fn__4038$fn__4089.invoke(executor.clj:749)
> at backtype.storm.util$async_loop$fn__364.invoke(util.clj:400)
> at clojure.lang.AFn.run(AFn.java:24)
> at java.lang.Thread.run(Thread.java:744)
>
> Locked ownable synchronizers:
> - None
> We are running storm-0.9.0_wip21 and I would be glad to provide any other
> information.
> Also see https://issues.apache.org/jira/browse/STORM-106.
> Thanks,
> Sushant