We are running a topology that processes about 10K tuples per second on a single 
worker. There are about 80 executor threads across 3 different bolt types, each 
receiving a tick tuple once per second. We observe that the tick tuples stop 
arriving after a few hours, or sometimes only after a day. The Nimbus UI also 
shows that the tick tuple count has frozen, whereas the bolts continue to 
process tuples from the spouts. I took a thread dump, and it shows the 
following:

"Thread-8" daemon prio=10 tid=0x00007f1a0fed2000 nid=0x4438 runnable [0x00007f19a5e01000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:349)
        at com.lmax.disruptor.AbstractMultithreadedClaimStrategy.waitForFreeSlotAt(AbstractMultithreadedClaimStrategy.java:99)
        at com.lmax.disruptor.AbstractMultithreadedClaimStrategy.incrementAndGet(AbstractMultithreadedClaimStrategy.java:49)
        at com.lmax.disruptor.Sequencer.next(Sequencer.java:127)
        at backtype.storm.utils.DisruptorQueue.publish(DisruptorQueue.java:113)
        at backtype.storm.disruptor$publish.invoke(disruptor.clj:51)
        at backtype.storm.disruptor$publish.invoke(disruptor.clj:53)
        at backtype.storm.daemon.executor$setup_ticks_BANG_$fn__3885.invoke(executor.clj:299)
        at backtype.storm.timer$schedule_recurring$this__1839.invoke(timer.clj:79)
        at backtype.storm.timer$mk_timer$thread_fn__1822$fn__1823.invoke(timer.clj:32)
        at backtype.storm.timer$mk_timer$thread_fn__1822.invoke(timer.clj:25)
        at clojure.lang.AFn.run(AFn.java:24)
        at java.lang.Thread.run(Thread.java:744)

   Locked ownable synchronizers:
        - None

"Thread-268-__system" prio=10 tid=0x00007f1a0ef45000 nid=0x46b5 runnable [0x00007f17ef574000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x000000073d9a2060> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2176)
        at com.lmax.disruptor.BlockingWaitStrategy.waitFor(BlockingWaitStrategy.java:87)
        at com.lmax.disruptor.ProcessingSequenceBarrier.waitFor(ProcessingSequenceBarrier.java:54)
        at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:56)
        at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:62)
        at backtype.storm.daemon.executor$eval4023$fn__4024$fn__4038$fn__4089.invoke(executor.clj:749)
        at backtype.storm.util$async_loop$fn__364.invoke(util.clj:400)
        at clojure.lang.AFn.run(AFn.java:24)
        at java.lang.Thread.run(Thread.java:744)

   Locked ownable synchronizers:
        - None
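
My tentative reading of the first trace: Thread-8 is the timer thread, and it is parked in waitForFreeSlotAt while publishing a tick from setup_ticks, which suggests the queue it is publishing to had no free slot when the dump was taken. If that one timer thread serves many executors, a single full queue could hold up ticks for all of them, though I have not confirmed this. Below is a minimal sketch of that kind of stall. It uses a plain java.util.concurrent.ArrayBlockingQueue as a stand-in for the Disruptor queue, and the class and method names are made up for illustration; it is not Storm's code.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class TickStallSketch {

    // Returns how many ticks a single "timer" thread managed to publish
    // into a bounded queue that nobody is draining, within waitMillis.
    public static int publishedBeforeStall(int capacity, int ticksToSend, long waitMillis) {
        BlockingQueue<String> receiveQueue = new ArrayBlockingQueue<>(capacity);
        AtomicInteger published = new AtomicInteger();

        Thread timer = new Thread(() -> {
            try {
                for (int i = 0; i < ticksToSend; i++) {
                    receiveQueue.put("tick-" + i); // blocks once the queue is full
                    published.incrementAndGet();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        timer.setDaemon(true);
        timer.start();

        try {
            timer.join(waitMillis); // give the timer time to fill the queue
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        int count = published.get();
        timer.interrupt(); // release the stuck thread so it can exit
        return count;
    }

    public static void main(String[] args) {
        // Capacity 4, 10 ticks to send, no consumer.
        System.out.println(publishedBeforeStall(4, 10, 500));
    }
}
```

With a capacity of 4 and no consumer, the timer thread publishes 4 ticks and then blocks on the fifth, which is the same shape as a tick count that freezes while the publishing thread stays parked.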

Any pointers on this would be appreciated. We are running storm-0.9.0_wip21.

Thanks,
Sushant
