[ https://issues.apache.org/jira/browse/STORM-80?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rick Kellogg updated STORM-80:
------------------------------
    Component/s: storm-core

> NPE caused by TridentBoltExecutor reusing TrackedBatches between batch groups
> -----------------------------------------------------------------------------
>
>                 Key: STORM-80
>                 URL: https://issues.apache.org/jira/browse/STORM-80
>             Project: Apache Storm
>          Issue Type: Bug
>          Components: storm-core
>            Reporter: James Xu
>
> https://github.com/nathanmarz/storm/issues/421
> I'm seeing intermittent errors caused by SubtopologyBolt.execute being called 
> with a BatchInfo whose ProcessorContext is set up for a different Batch 
> Group. In particular I'm seeing null pointer exceptions from 
> PartitionPersistProcessor because its state fields were never set up 
> correctly.
> As best I can tell, the id key (IBatchID) being used for the _batches map in 
> TridentBoltExecutor is not unique between batch groups. As a result the 
> tracked batch will have been initialized for a different Batch Group and set 
> of processors.
> I had hoped to track down the source of this issue, but I can't determine 
> where the BatchIDs are being added to the tuples.
> If it matters, my topology has two streams, each reading from its own 
> OpaqueTransactionalKafka spout with a different topic.
> Backtrace:
> 65108 [Thread-25] ERROR backtype.storm.daemon.executor - 
> java.lang.RuntimeException: java.lang.NullPointerException
>         at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:87) ~[storm-0.9.0-wip4.jar:na]
>         at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:58) ~[storm-0.9.0-wip4.jar:na]
>         at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:62) ~[storm-0.9.0-wip4.jar:na]
>         at backtype.storm.daemon.executor$fn__3551$fn__3563$fn__3610.invoke(executor.clj:712) ~[storm-0.9.0-wip4.jar:na]
>         at backtype.storm.util$async_loop$fn__436.invoke(util.clj:377) ~[storm-0.9.0-wip4.jar:na]
>         at clojure.lang.AFn.run(AFn.java:24) [clojure-1.4.0.jar:na]
>         at java.lang.Thread.run(Thread.java:722) [na:1.7.0_09]
> Caused by: java.lang.NullPointerException: null
>         at storm.trident.planner.processor.PartitionPersistProcessor.execute(PartitionPersistProcessor.java:59) ~[storm-0.9.0-wip4.jar:na]
>         at storm.trident.planner.SubtopologyBolt$InitialReceiver.receive(SubtopologyBolt.java:189) ~[storm-0.9.0-wip4.jar:na]
>         at storm.trident.planner.SubtopologyBolt.execute(SubtopologyBolt.java:129) ~[storm-0.9.0-wip4.jar:na]
>         at storm.trident.topology.TridentBoltExecutor.execute(TridentBoltExecutor.java:352) ~[storm-0.9.0-wip4.jar:na]
>         at backtype.storm.daemon.executor$fn__3551$tuple_action_fn__3553.invoke(executor.clj:607) ~[storm-0.9.0-wip4.jar:na]
>         at backtype.storm.daemon.executor$mk_task_receiver$fn__3474.invoke(executor.clj:379) ~[storm-0.9.0-wip4.jar:na]
>         at backtype.storm.disruptor$clojure_handler$reify__3011.onEvent(disruptor.clj:43) ~[storm-0.9.0-wip4.jar:na]
>         at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:84) ~[storm-0.9.0-wip4.jar:na]
>         ... 6 common frames omitted
> Also, I'm only seeing this in LocalCluster mode, not in production.
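
The key collision the reporter suspects can be illustrated with a small, self-contained sketch. It is not Storm code: the class BatchKeyCollisionSketch, the Tracked and GroupedKey types, and both lookup methods are hypothetical names invented for illustration, and the sketch assumes the reporter's hypothesis (a batch map keyed by an id that is not unique across batch groups) is what actually happens. It shows how a map keyed only by transaction id hands a second batch group the entry that the first group initialized, and how qualifying the key with the group avoids that.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical illustration only; none of these names exist in Storm.
class BatchKeyCollisionSketch {

    /** Stand-in for a tracked batch: records which batch group initialized it. */
    static final class Tracked {
        final String initializedFor;
        Tracked(String group) { this.initializedFor = group; }
    }

    /** Composite key: a transaction id qualified by its batch group. */
    static final class GroupedKey {
        final String group;
        final long txid;
        GroupedKey(String group, long txid) { this.group = group; this.txid = txid; }

        @Override public boolean equals(Object o) {
            if (!(o instanceof GroupedKey)) return false;
            GroupedKey k = (GroupedKey) o;
            return txid == k.txid && group.equals(k.group);
        }

        @Override public int hashCode() { return Objects.hash(group, txid); }
    }

    /** Keyed by txid alone: a second group with the same txid reuses the first group's entry. */
    static String lookupByTxidOnly(Map<Long, Tracked> batches, String group, long txid) {
        return batches.computeIfAbsent(txid, id -> new Tracked(group)).initializedFor;
    }

    /** Keyed by (group, txid): each group gets an entry initialized for itself. */
    static String lookupByGroupAndTxid(Map<GroupedKey, Tracked> batches, String group, long txid) {
        return batches.computeIfAbsent(new GroupedKey(group, txid), k -> new Tracked(group)).initializedFor;
    }

    public static void main(String[] args) {
        Map<Long, Tracked> byTxid = new HashMap<>();
        lookupByTxidOnly(byTxid, "groupA", 1L);
        // groupB asks for txid 1 and receives state set up for groupA.
        System.out.println(lookupByTxidOnly(byTxid, "groupB", 1L));       // prints "groupA"

        Map<GroupedKey, Tracked> byGroup = new HashMap<>();
        lookupByGroupAndTxid(byGroup, "groupA", 1L);
        System.out.println(lookupByGroupAndTxid(byGroup, "groupB", 1L));  // prints "groupB"
    }
}
```

In the first map, the entry returned to groupB was built for groupA, which would be consistent with a processor finding state fields that were never set up for its own group. Whether a composite key is the appropriate fix in TridentBoltExecutor is not established by the report.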



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
