Petr Janeček created STORM-3422:
-----------------------------------

             Summary: TupleCaptureBolt seems to be not thread-safe
                 Key: STORM-3422
                 URL: https://issues.apache.org/jira/browse/STORM-3422
             Project: Apache Storm
          Issue Type: Bug
          Components: storm-client
    Affects Versions: 1.2.2, 2.0.0
            Reporter: Petr Janeček


I am marking this as Major even though the problem lies in testing code: it 
makes integration testing hard, but it does not affect any production code.

 

First, let me show you a stack trace for Storm 2.0.0:

{{java.lang.RuntimeException: java.lang.NullPointerException}}
{{ at org.apache.storm.executor.Executor.accept(Executor.java:282) ~[storm-client-2.0.0.jar:2.0.0]}}
{{ at org.apache.storm.utils.JCQueue.consumeImpl(JCQueue.java:133) ~[storm-client-2.0.0.jar:2.0.0]}}
{{ at org.apache.storm.utils.JCQueue.consume(JCQueue.java:110) ~[storm-client-2.0.0.jar:2.0.0]}}
{{ at org.apache.storm.executor.bolt.BoltExecutor$1.call(BoltExecutor.java:171) ~[storm-client-2.0.0.jar:2.0.0]}}
{{ at org.apache.storm.executor.bolt.BoltExecutor$1.call(BoltExecutor.java:158) ~[storm-client-2.0.0.jar:2.0.0]}}
{{ at org.apache.storm.utils.Utils$1.run(Utils.java:388) [storm-client-2.0.0.jar:2.0.0]}}
{{ at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]}}
{{Caused by: java.lang.NullPointerException}}
{{ at org.apache.storm.testing.TupleCaptureBolt.execute(TupleCaptureBolt.java:45) ~[storm-client-2.0.0.jar:2.0.0]}}
{{ at org.apache.storm.executor.bolt.BoltExecutor.tupleActionFn(BoltExecutor.java:234) ~[storm-client-2.0.0.jar:2.0.0]}}
{{ at org.apache.storm.executor.Executor.accept(Executor.java:275) ~[storm-client-2.0.0.jar:2.0.0]}}
{{ ... 6 more}}

 

Here's the same for Storm 1.2.2:

{{java.lang.RuntimeException: java.lang.NullPointerException}}
{{ at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:522) ~[storm-core-1.2.2.jar:1.2.2]}}
{{ at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:487) ~[storm-core-1.2.2.jar:1.2.2]}}
{{ at org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:74) ~[storm-core-1.2.2.jar:1.2.2]}}
{{ at org.apache.storm.daemon.executor$fn__10795$fn__10808$fn__10861.invoke(executor.clj:861) ~[storm-core-1.2.2.jar:1.2.2]}}
{{ at org.apache.storm.util$async_loop$fn__553.invoke(util.clj:484) [storm-core-1.2.2.jar:1.2.2]}}
{{ at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]}}
{{ at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]}}
{{Caused by: java.lang.NullPointerException}}
{{ at org.apache.storm.testing.TupleCaptureBolt.execute(TupleCaptureBolt.java:50) ~[storm-core-1.2.2.jar:1.2.2]}}
{{ at org.apache.storm.daemon.executor$fn__10795$tuple_action_fn__10797.invoke(executor.clj:739) ~[storm-core-1.2.2.jar:1.2.2]}}
{{ at org.apache.storm.daemon.executor$mk_task_receiver$fn__10716.invoke(executor.clj:468) ~[storm-core-1.2.2.jar:1.2.2]}}
{{ at org.apache.storm.disruptor$clojure_handler$reify__10135.onEvent(disruptor.clj:41) ~[storm-core-1.2.2.jar:1.2.2]}}
{{ at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:509) ~[storm-core-1.2.2.jar:1.2.2]}}
{{ ... 6 more}}

 

Both traces come from a topology that we run as an integration test using 
{{Testing.completeTopology()}}. They point to the same code in 
{{TupleCaptureBolt}}: its {{name}} field is not safely published (it should be 
marked {{final}}), and the internal {{HashMap}} is not safe for concurrent 
access. Perhaps it should be a {{ConcurrentHashMap}}?
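
To illustrate the suggestion, here is a minimal sketch with no Storm dependencies. It is not the actual {{TupleCaptureBolt}} source: the class {{CaptureSketch}}, the registry {{CAPTURED}}, and the methods {{capture}} and {{getName}} are hypothetical names chosen for this example. It only shows the two ideas above, a {{final}} name field and a {{ConcurrentHashMap}} registry, plus a synchronized list per component, which is an additional assumption on my part.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a capture bolt; NOT the real Storm class. */
final class CaptureSketch {
    // Shared registry: outer key is the capture name, inner key is the
    // source component. ConcurrentHashMap makes put/get safe across threads.
    static final Map<String, Map<String, List<Object>>> CAPTURED =
            new ConcurrentHashMap<>();

    // A final field is safely published to any thread that obtains a
    // reference to this object after the constructor has finished.
    private final String name;

    CaptureSketch() {
        this.name = UUID.randomUUID().toString();
        CAPTURED.put(name, new ConcurrentHashMap<>());
    }

    String getName() {
        return name;
    }

    // Plays the role of execute(): record one value under its source component.
    void capture(String component, Object value) {
        CAPTURED.get(name)
                // Atomic check-then-create, unlike containsKey() followed by put().
                .computeIfAbsent(component,
                        k -> Collections.synchronizedList(new ArrayList<Object>()))
                .add(value);
    }
}
```

With this shape, several threads can call {{capture}} on the same instance and every value ends up in the list for its component. Whether the same change is sufficient inside Storm would still need to be confirmed against the real class.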

Would you accept a PR with a more detailed analysis, or are you going to 
investigate on your side?



