[ 
https://issues.apache.org/jira/browse/STORM-292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13978575#comment-13978575
 ] 

Kishor Patil commented on STORM-292:
------------------------------------

Since SplitAndJoinTopology is cyclic, I was looking at Nathan's previous 
commits to "fix deadlock bug due to variant of dining philosophers problem", 
which use an overflow buffer for spouts:

https://github.com/nathanmarz/storm/commit/1a9dca46abe4c937e6b5874a9d1b178163a95af4
https://groups.google.com/forum/#!msg/storm-user/c1g_s5L8yuI/JV5q94SCnWQJ

I think Storm should extend the use of the overflow buffer to bolts as well, 
to avoid a deadlock when a bolt publishes to a queue with no free slots. Any 
thoughts?
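
To make the proposal concrete, here is a minimal sketch of the overflow-buffer 
idea. It is not Storm's code: OverflowPublisher is a hypothetical class, and a 
java.util.concurrent.ArrayBlockingQueue stands in for the bounded disruptor 
queue. It assumes a single publishing thread (the executor thread) and an 
unbounded overflow, so memory can grow if the consumer never catches up.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch, not Storm's implementation.
// The BlockingQueue is an assumed stand-in for the bounded disruptor queue.
public class OverflowPublisher<T> {
    private final BlockingQueue<T> target;                // bounded downstream queue
    private final Queue<T> overflow = new ArrayDeque<>(); // unbounded, publisher-local

    public OverflowPublisher(BlockingQueue<T> target) {
        this.target = target;
    }

    // Never blocks: items that do not fit are parked in the overflow buffer.
    public void publish(T item) {
        flush(); // drain older items first so ordering is preserved
        if (!overflow.isEmpty() || !target.offer(item)) {
            overflow.add(item);
        }
    }

    // Moves as many parked items as currently fit into the target queue.
    public void flush() {
        while (!overflow.isEmpty() && target.offer(overflow.peek())) {
            overflow.remove();
        }
    }

    public int overflowSize() {
        return overflow.size();
    }
}
```

With this shape, the bolt's thread returns from publish() right away and can 
go back to draining its own receive queue, which is what breaks the cycle. 
flush() would have to be called periodically from the same thread.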


> emit blocks the publishing bolt if disruptor queue is full
> -------------------------------------------------------------
>
>                 Key: STORM-292
>                 URL: https://issues.apache.org/jira/browse/STORM-292
>             Project: Apache Storm (Incubating)
>          Issue Type: Bug
>    Affects Versions: 0.9.2-incubating
>            Reporter: Kishor Patil
>
> During testing, I noticed that once the disruptor queue is full, it blocks 
> (TIMED_WAITING) the publishing bolt, essentially creating a slowdown 
> live-lock. Should outputCollector.emit be a non-blocking call?
> Also, the configs meant to give better control over the disruptor 
> queue/buffer sizes do not seem to have any impact:
>  
> "topology.executor.receive.buffer.size"
> "topology.receiver.buffer.size"
> "topology.executor.send.buffer.size"
> "topology.transfer.buffer.size"
> Below is an example of a topology that re-creates the live-lock scenario 
> with the disruptor queue:
> https://github.com/kishorvpatil/incubator-storm/blob/dqueue-full/examples/storm-starter/src/jvm/storm/starter/SplitJoinTopology.java
> "Thread-9-splitjoinbolt" prio=10 tid=0x00007fe518a9f000 nid=0x64b5 waiting on condition [0x00007fe5144f2000]
>    java.lang.Thread.State: TIMED_WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:349)
>         at com.lmax.disruptor.SingleThreadedClaimStrategy.waitForFreeSlotAt(SingleThreadedClaimStrategy.java:129)
>         at com.lmax.disruptor.SingleThreadedClaimStrategy.incrementAndGet(SingleThreadedClaimStrategy.java:81)
>         at com.lmax.disruptor.Sequencer.next(Sequencer.java:127)
>         at backtype.storm.utils.DisruptorQueue.publish(DisruptorQueue.java:113)
>         at backtype.storm.disruptor$publish.invoke(disruptor.clj:51)
>         at backtype.storm.daemon.executor$mk_executor_transfer_fn$this__4913.invoke(executor.clj:176)
>         at backtype.storm.daemon.executor$mk_executor_transfer_fn$this__4913.invoke(executor.clj:183)
>         at backtype.storm.daemon.executor$mk_executor_transfer_fn$this__4913.invoke(executor.clj:185)
>         at backtype.storm.daemon.executor$fn__5141$fn__5155$bolt_emit__5184.invoke(executor.clj:683)
>         at backtype.storm.daemon.executor$fn__5141$fn$reify__5190.emit(executor.clj:693)
>         at backtype.storm.task.OutputCollector.emit(OutputCollector.java:186)
>         at backtype.storm.task.OutputCollector.emit(OutputCollector.java:32)
>         at backtype.storm.topology.BasicOutputCollector.emit(BasicOutputCollector.java:19)
>         at storm.starter.bolt.SplitAndCountBolt.execute(SplitAndCountBolt.java:24)
>         at backtype.storm.topology.BasicBoltExecutor.execute(BasicBoltExecutor.java:33)
>         at backtype.storm.daemon.executor$fn__5141$tuple_action_fn__5143.invoke(executor.clj:634)
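
For anyone trying to reproduce this, the four buffer-size keys quoted in the 
issue description could be set along these lines. This is only a sketch: 
BufferSizeConf is a hypothetical helper, the values in the usage note are 
illustrative, and a plain Map stands in for Storm's topology config (which is 
itself a map of string keys to values). The power-of-two check reflects the 
LMAX Disruptor ring buffer requirement. Treating the three disruptor-backed 
keys this way is an assumption, and topology.receiver.buffer.size is left 
unchecked on the same assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Storm.
public class BufferSizeConf {
    static boolean isPowerOfTwo(int n) {
        return n > 0 && (n & (n - 1)) == 0;
    }

    // Builds a config map containing the four keys from the issue description.
    static Map<String, Object> build(int executorReceive, int executorSend,
                                     int transfer, int receiver) {
        // Assumption: these three sizes back disruptor ring buffers.
        for (int size : new int[] {executorReceive, executorSend, transfer}) {
            if (!isPowerOfTwo(size)) {
                throw new IllegalArgumentException(
                    "disruptor buffer size must be a power of 2: " + size);
            }
        }
        Map<String, Object> conf = new LinkedHashMap<>();
        conf.put("topology.executor.receive.buffer.size", executorReceive);
        conf.put("topology.executor.send.buffer.size", executorSend);
        conf.put("topology.transfer.buffer.size", transfer);
        conf.put("topology.receiver.buffer.size", receiver);
        return conf;
    }
}
```

For example, BufferSizeConf.build(1024, 1024, 32, 8) returns a map with all 
four keys set, while a size such as 1000 is rejected before submission.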



--
This message was sent by Atlassian JIRA
(v6.2#6252)
