[ 
https://issues.apache.org/jira/browse/STORM-120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13972138#comment-13972138
 ] 

Patrick Lucas commented on STORM-120:
-------------------------------------

I believe I'm closing in on a cause: I don't think ShellBolts (and presumably 
spouts) are thread-safe when more than one runs in a single executor.

In ShellBolt.prepare, two threads are created: 
[one|https://github.com/apache/incubator-storm/blob/1a0b46e95ab4ac467525314a75819a75dec92c40/storm-core/src/jvm/backtype/storm/task/ShellBolt.java#L109]
 for reading from a synchronous queue of events populated by ShellBolt.execute 
and writing to stdin of the subprocess, and 
[one|https://github.com/apache/incubator-storm/blob/1a0b46e95ab4ac467525314a75819a75dec92c40/storm-core/src/jvm/backtype/storm/task/ShellBolt.java#L141]
 for reading from stdout of the subprocess and calling emit/ack/fail/etc. on 
the OutputCollector. I believe these calls are the source of the thread safety 
problems.

If there are two ShellBolt tasks in the same executor, there will be two 
"readerThread"s emitting in parallel. These calls need to be synchronized on 
some aspect of the OutputCollector. To test this theory, I 
[patched|https://github.com/patricklucas/incubator-storm/blob/caeb80d3402fa4eeae7e03871379c18f8f4e8838/storm-core/src/clj/backtype/storm/daemon/executor.clj#L689]
 Storm to lock on the executor's grouper. Since then I have turned up the 
tasks-per-executor on my bolts and have not seen any of these errors. (I don't 
intend to offer this patch for merging; it is just a proof of concept.)
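
For illustration only, here is a minimal Java sketch of the kind of 
serialization described above. It is not Storm code: SharedCursorSketch, 
Cursor, and runWithLock are hypothetical names, Cursor is a simplified 
stand-in for the state that util/acquire-random-range-id advances, and the 
plain lock object stands in for whatever executor-wide lock is chosen (the 
grouper, in the proof-of-concept patch). Each worker thread plays the role of 
one "readerThread".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

public class SharedCursorSketch {
    // Hypothetical stand-in for shared per-executor state; not Storm's code.
    static final class Cursor {
        private final List<Integer> state = new ArrayList<>();
        private int curr = -1;

        Cursor(int size) {
            for (int i = 0; i < size; i++) {
                state.add(i);
            }
        }

        // Check-then-act: the increment, the bounds check, and the read are
        // separate steps, so this is only safe if callers never interleave.
        int next() {
            curr++;
            if (curr >= state.size()) {
                curr = 0;
            }
            return state.get(curr);
        }
    }

    // Starts `threads` workers that each call next() `callsPerThread` times
    // while holding one shared lock; returns how many calls completed.
    static long runWithLock(int threads, int callsPerThread) {
        final Cursor cursor = new Cursor(3);
        final Object lock = new Object(); // assumed executor-wide lock
        final AtomicLong completed = new AtomicLong();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread worker = new Thread(() -> {
                for (int i = 0; i < callsPerThread; i++) {
                    synchronized (lock) {
                        cursor.next(); // serialized across all workers
                    }
                    completed.incrementAndGet();
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining", e);
            }
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println(runWithLock(4, 100_000));
    }
}
```

With the lock held around every call, all calls complete. Without the 
synchronized block, one thread can increment curr after another thread's 
bounds check has passed, so state.get(curr) may throw 
IndexOutOfBoundsException, which is the failure mode reported in this issue.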

> util/acquire-random-range-id is not thread-safe
> -----------------------------------------------
>
>                 Key: STORM-120
>                 URL: https://issues.apache.org/jira/browse/STORM-120
>             Project: Apache Storm (Incubating)
>          Issue Type: Bug
>            Reporter: James Xu
>            Priority: Minor
>
> https://github.com/nathanmarz/storm/issues/724
> Concurrent calls to util/acquire-random-range-id with the same parameters can 
> result in an IndexOutOfBoundsException, as an increment in one thread may 
> occur after the bounds check in another. The resulting curr value can be >= 
> the size of the List state.
> https://github.com/nathanmarz/storm/blob/fc5fbb8b352cf91050cdde4a9f9e77e673ab7f48/storm-core/src/clj/backtype/storm/util.clj#L606



--
This message was sent by Atlassian JIRA
(v6.2#6252)
