[
https://issues.apache.org/jira/browse/STORM-3422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Petr Janeček updated STORM-3422:
--------------------------------
Description:
I am marking this as Major, although the problem lies in testing code: it makes
integration testing hard, but it does not affect any production code.
First, here is a stack trace from Storm 2.0.0:
{{java.lang.RuntimeException: java.lang.NullPointerException}}
{{at org.apache.storm.executor.Executor.accept(Executor.java:282) ~[storm-client-2.0.0.jar:2.0.0]}}
{{at org.apache.storm.utils.JCQueue.consumeImpl(JCQueue.java:133) ~[storm-client-2.0.0.jar:2.0.0]}}
{{at org.apache.storm.utils.JCQueue.consume(JCQueue.java:110) ~[storm-client-2.0.0.jar:2.0.0]}}
{{at org.apache.storm.executor.bolt.BoltExecutor$1.call(BoltExecutor.java:171) ~[storm-client-2.0.0.jar:2.0.0]}}
{{at org.apache.storm.executor.bolt.BoltExecutor$1.call(BoltExecutor.java:158) ~[storm-client-2.0.0.jar:2.0.0]}}
{{at org.apache.storm.utils.Utils$1.run(Utils.java:388) [storm-client-2.0.0.jar:2.0.0]}}
{{at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]}}
{{Caused by: java.lang.NullPointerException}}
{{at org.apache.storm.testing.TupleCaptureBolt.execute(TupleCaptureBolt.java:45) ~[storm-client-2.0.0.jar:2.0.0]}}
{{at org.apache.storm.executor.bolt.BoltExecutor.tupleActionFn(BoltExecutor.java:234) ~[storm-client-2.0.0.jar:2.0.0]}}
{{at org.apache.storm.executor.Executor.accept(Executor.java:275) ~[storm-client-2.0.0.jar:2.0.0]}}
{{... 6 more}}
Here is the equivalent trace from Storm 1.2.2:
{{java.lang.RuntimeException: java.lang.NullPointerException}}
{{at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:522) ~[storm-core-1.2.2.jar:1.2.2]}}
{{at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:487) ~[storm-core-1.2.2.jar:1.2.2]}}
{{at org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:74) ~[storm-core-1.2.2.jar:1.2.2]}}
{{at org.apache.storm.daemon.executor$fn__10795$fn__10808$fn__10861.invoke(executor.clj:861) ~[storm-core-1.2.2.jar:1.2.2]}}
{{at org.apache.storm.util$async_loop$fn__553.invoke(util.clj:484) [storm-core-1.2.2.jar:1.2.2]}}
{{at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]}}
{{at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]}}
{{Caused by: java.lang.NullPointerException}}
{{at org.apache.storm.testing.TupleCaptureBolt.execute(TupleCaptureBolt.java:50) ~[storm-core-1.2.2.jar:1.2.2]}}
{{at org.apache.storm.daemon.executor$fn__10795$tuple_action_fn__10797.invoke(executor.clj:739) ~[storm-core-1.2.2.jar:1.2.2]}}
{{at org.apache.storm.daemon.executor$mk_task_receiver$fn__10716.invoke(executor.clj:468) ~[storm-core-1.2.2.jar:1.2.2]}}
{{at org.apache.storm.disruptor$clojure_handler$reify__10135.onEvent(disruptor.clj:41) ~[storm-core-1.2.2.jar:1.2.2]}}
{{at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:509) ~[storm-core-1.2.2.jar:1.2.2]}}
{{... 6 more}}
Both traces come from a topology that we run as an integration test using
{{Testing.completeTopology()}}. They point to the same code in
{{TupleCaptureBolt}}: its {{name}} field is not safely published (it should be
marked {{final}}), and the internal {{HashMap}} is not safe for concurrent
access, so data put into it may not be stored or seen reliably. Perhaps it
should be a {{ConcurrentHashMap}}?
Would you accept a PR with a more detailed analysis, or are you going to
investigate on your side?
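To make the suggestion concrete, here is a minimal sketch of the two changes proposed above: a {{final}} name field and a {{ConcurrentHashMap}} for the shared storage. It is an illustration under assumptions, not the actual {{TupleCaptureBolt}} source and not a confirmed diagnosis. The class {{SafeCapture}}, its methods {{record}}, {{results}} and {{stress}}, and the use of {{CopyOnWriteArrayList}} are all hypothetical names and choices made for this example, and the class deliberately has no dependency on Storm.

```java
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical stand-in for a capture bolt; not Storm's actual class. */
public class SafeCapture {
    // Shared by all instances and threads, so a concurrent map is used.
    private static final Map<String, Map<String, List<Object>>> CAPTURED =
            new ConcurrentHashMap<>();

    // A final field is visible to other threads once the constructor has
    // finished, provided "this" does not escape during construction.
    private final String name;

    public SafeCapture() {
        name = UUID.randomUUID().toString();
        CAPTURED.put(name, new ConcurrentHashMap<>());
    }

    /** Records one value for a source component; callable from many threads. */
    public void record(String component, Object value) {
        CAPTURED.get(name)
                // Atomic "create the list if missing", unlike containsKey + put.
                .computeIfAbsent(component, k -> new CopyOnWriteArrayList<>())
                .add(value);
    }

    /** Returns everything this instance has captured, keyed by component. */
    public Map<String, List<Object>> results() {
        return CAPTURED.get(name);
    }

    /** Writes from several threads and returns how many values were captured. */
    public static int stress(int threads, int perThread) {
        SafeCapture capture = new SafeCapture();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    capture.record("spout", j);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return capture.results().get("spout").size();
    }
}
```

With these changes, {{SafeCapture.stress(4, 250)}} should always report 1000 captured values. {{CopyOnWriteArrayList}} is chosen here only for simplicity; for a large number of tuples a synchronized list would likely be cheaper on writes.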
> TupleCaptureBolt seems to be not thread-safe
> --------------------------------------------
>
> Key: STORM-3422
> URL: https://issues.apache.org/jira/browse/STORM-3422
> Project: Apache Storm
> Issue Type: Bug
> Components: storm-client
> Affects Versions: 2.0.0, 1.2.2
> Reporter: Petr Janeček
> Priority: Major