[ https://issues.apache.org/jira/browse/STORM-80?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rick Kellogg updated STORM-80:
------------------------------
Component/s: storm-core
> NPE caused by TridentBoltExecutor reusing TrackedBatches between batch groups
> -----------------------------------------------------------------------------
>
> Key: STORM-80
> URL: https://issues.apache.org/jira/browse/STORM-80
> Project: Apache Storm
> Issue Type: Bug
> Components: storm-core
> Reporter: James Xu
>
> https://github.com/nathanmarz/storm/issues/421
> I'm seeing intermittent errors caused by SubtopologyBolt.execute being called
> with a BatchInfo whose ProcessorContext is set up for a different Batch
> Group. In particular I'm seeing null pointer exceptions from
> PartitionPersistProcessor because its state fields were never set up
> correctly.
> As best I can tell, the id key (IBatchID) used for the _batches map in
> TridentBoltExecutor is not unique across batch groups. As a result, the
> tracked batch will have been initialized for a different Batch Group and set
> of processors.
> I hoped to be able to track down the source of this issue but can't determine
> where the BatchIDs are being added to the tuples.
> If it matters, my topology has two streams, each reading from its own
> OpaqueTransactionalKafka spout with a different topic.
> Backtrace:
> 65108 [Thread-25] ERROR backtype.storm.daemon.executor -
> java.lang.RuntimeException: java.lang.NullPointerException
>     at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:87) ~[storm-0.9.0-wip4.jar:na]
>     at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:58) ~[storm-0.9.0-wip4.jar:na]
>     at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:62) ~[storm-0.9.0-wip4.jar:na]
>     at backtype.storm.daemon.executor$fn__3551$fn__3563$fn__3610.invoke(executor.clj:712) ~[storm-0.9.0-wip4.jar:na]
>     at backtype.storm.util$async_loop$fn__436.invoke(util.clj:377) ~[storm-0.9.0-wip4.jar:na]
>     at clojure.lang.AFn.run(AFn.java:24) [clojure-1.4.0.jar:na]
>     at java.lang.Thread.run(Thread.java:722) [na:1.7.0_09]
> Caused by: java.lang.NullPointerException: null
>     at storm.trident.planner.processor.PartitionPersistProcessor.execute(PartitionPersistProcessor.java:59) ~[storm-0.9.0-wip4.jar:na]
>     at storm.trident.planner.SubtopologyBolt$InitialReceiver.receive(SubtopologyBolt.java:189) ~[storm-0.9.0-wip4.jar:na]
>     at storm.trident.planner.SubtopologyBolt.execute(SubtopologyBolt.java:129) ~[storm-0.9.0-wip4.jar:na]
>     at storm.trident.topology.TridentBoltExecutor.execute(TridentBoltExecutor.java:352) ~[storm-0.9.0-wip4.jar:na]
>     at backtype.storm.daemon.executor$fn__3551$tuple_action_fn__3553.invoke(executor.clj:607) ~[storm-0.9.0-wip4.jar:na]
>     at backtype.storm.daemon.executor$mk_task_receiver$fn__3474.invoke(executor.clj:379) ~[storm-0.9.0-wip4.jar:na]
>     at backtype.storm.disruptor$clojure_handler$reify__3011.onEvent(disruptor.clj:43) ~[storm-0.9.0-wip4.jar:na]
>     at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:84) ~[storm-0.9.0-wip4.jar:na]
>     ... 6 common frames omitted
> Also, I'm only seeing this in LocalCluster mode, not in production.
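
The key collision the reporter suspects can be illustrated with a minimal, self-contained sketch. This is not Storm's code: the class BatchKeyCollisionSketch, the types Tracked and GroupedKey, and both lookup methods are hypothetical names invented for illustration, and the sketch assumes only what the report states, namely that per-batch state is cached in a map whose key may repeat across batch groups. It requires Java 8 or later.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class BatchKeyCollisionSketch {
    /** Hypothetical stand-in for per-batch state; NOT Storm's TrackedBatch. */
    static final class Tracked {
        final String batchGroup; // group whose processors initialized this state
        Tracked(String batchGroup) { this.batchGroup = batchGroup; }
    }

    /** Hypothetical composite key: batch group plus transaction id. */
    static final class GroupedKey {
        final String batchGroup;
        final long txid;
        GroupedKey(String batchGroup, long txid) {
            this.batchGroup = batchGroup;
            this.txid = txid;
        }
        @Override public boolean equals(Object o) {
            if (!(o instanceof GroupedKey)) return false;
            GroupedKey k = (GroupedKey) o;
            return txid == k.txid && batchGroup.equals(k.batchGroup);
        }
        @Override public int hashCode() { return Objects.hash(batchGroup, txid); }
    }

    /** Keyed by txid only: a second group with the same txid reuses the first group's state. */
    static String lookupByTxidOnly(Map<Long, Tracked> batches, String group, long txid) {
        return batches.computeIfAbsent(txid, k -> new Tracked(group)).batchGroup;
    }

    /** Keyed by (group, txid): each group gets state initialized for itself. */
    static String lookupByGroupAndTxid(Map<GroupedKey, Tracked> batches, String group, long txid) {
        return batches.computeIfAbsent(new GroupedKey(group, txid), k -> new Tracked(group)).batchGroup;
    }

    public static void main(String[] args) {
        Map<Long, Tracked> naive = new HashMap<>();
        lookupByTxidOnly(naive, "bg0", 1L);
        // Group bg1 asks for txid 1 and receives state that was set up for bg0.
        System.out.println(lookupByTxidOnly(naive, "bg1", 1L)); // prints "bg0"

        Map<GroupedKey, Tracked> grouped = new HashMap<>();
        lookupByGroupAndTxid(grouped, "bg0", 1L);
        System.out.println(lookupByGroupAndTxid(grouped, "bg1", 1L)); // prints "bg1"
    }
}
```

In the first map, the lookup for the second group returns state created for the first, which matches the symptom described in the report: a processor runs against a context that was never initialized for it. Whether Storm's actual IBatchID behaves this way is the open question of the issue; the sketch only shows why a non-unique key would produce that result.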