[
https://issues.apache.org/jira/browse/STORM-770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496240#comment-14496240
]
Stas Levin edited comment on STORM-770 at 4/15/15 2:36 PM:
-----------------------------------------------------------
The reported exception was observed in one of the bolts after we had
experimented with substantially increasing the traffic handled by the
topology. The topology's core logic has not changed in the past few months.
It may be worth noting that the exception happened suddenly: there were no
earlier warnings or signs of distress in the logs. Once the exception
occurred, the topology could not recover and went into a restart loop.
Topology details:
We're using 'fieldsGrouping' and running 2 workers per machine. The topology
is assigned 10 workers (i.e., it runs on 5 nodes), and each worker is
assigned 24 executors.
We have a total of 5 components: 4 bolts and 1 Kafka spout.
Direct emit is not used.
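For a sense of scale, the figures above can be worked through in a few lines. This is only a sketch: it assumes the 24 executors per worker are spread evenly as stated, and the class and method names are hypothetical, not part of Storm.

```java
public class TopologyLayoutSketch {
    // Figures taken from the topology details above.
    static final int WORKERS = 10;
    static final int WORKERS_PER_MACHINE = 2;
    static final int EXECUTORS_PER_WORKER = 24; // assumed uniform across workers

    // Number of machines needed to host all workers.
    static int nodes() {
        return WORKERS / WORKERS_PER_MACHINE;
    }

    // Total executors across the whole topology.
    static int totalExecutors() {
        return WORKERS * EXECUTORS_PER_WORKER;
    }

    public static void main(String[] args) {
        System.out.println("nodes = " + nodes());
        System.out.println("total executors = " + totalExecutors());
    }
}
```

Under these assumptions the topology runs 240 executors across 5 nodes.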
> NullPointerException in consumeBatchToCursor
> --------------------------------------------
>
> Key: STORM-770
> URL: https://issues.apache.org/jira/browse/STORM-770
> Project: Apache Storm
> Issue Type: Bug
> Affects Versions: 0.9.2-incubating
> Reporter: Stas Levin
>
> We got the following exception after our topology had been up for ~2 days,
> and I was wondering if it might be related.
> It looks like "task" in "mk-transfer-fn" is null, which makes "(.add remote
> (TaskMessage. task (.serialize serializer tuple)))" fail with an NPE
> (worker.clj:128, storm-core-0.9.2-incubating.jar).
> java.lang.RuntimeException: java.lang.NullPointerException
> at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:128) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
> at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:99) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
> at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:80) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
> at backtype.storm.disruptor$consume_loop_STAR_$fn__758.invoke(disruptor.clj:94) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
> at backtype.storm.util$async_loop$fn__457.invoke(util.clj:431) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
> at clojure.lang.AFn.run(AFn.java:24) [clojure-1.5.1.jar:na]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_72]
> Caused by: java.lang.NullPointerException: null
> at clojure.lang.RT.intCast(RT.java:1087) ~[clojure-1.5.1.jar:na]
> at backtype.storm.daemon.worker$mk_transfer_fn$fn__5748.invoke(worker.clj:128) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
> at backtype.storm.daemon.executor$start_batch_transfer_GT_worker_handler_BANG$fn__5483.invoke(executor.clj:256) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
> at backtype.storm.disruptor$clojure_handler$reify__745.onEvent(disruptor.clj:58) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
> at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:125) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
> ... 6 common frames omitted,java.lang.RuntimeException:
> java.lang.NullPointerException
> Any ideas?
> P.S.
> Also saw it here:
> http://mail-archives.apache.org/mod_mbox/storm-user/201501.mbox/%3CCABcMBhCusXXU=v1e66wfuatgyh1euqnd1siog65-tp8xlwx...@mail.gmail.com%3E
> https://mail-archives.apache.org/mod_mbox/storm-user/201408.mbox/%3ccajuqm_4kxhsh2_x08ujuqr76m2c+dswp0fcijbmfcaeyqgs...@mail.gmail.com%3E
> Comment from Bobby
> http://mail-archives.apache.org/mod_mbox/storm-user/201501.mbox/%3c574363643.2791948.1420470097280.javamail.ya...@jws10027.mail.ne1.yahoo.com%3E
> {quote}
> What version of storm are you using? Are any of the bolts shell bolts?
> There is a known issue where this can happen if two shell bolts share an
> executor, because they are multi-threaded.
> - Bobby
> {quote}
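The failure mode the reporter suspects (a null task id reaching code that needs a primitive int) can be shown in isolation. The following is a minimal Java sketch, not Storm's code: the `Message` class and both builder methods are hypothetical, and Java auto-unboxing stands in for the `clojure.lang.RT.intCast` call seen in the stack trace. It says nothing about why the task id would be null in the first place.

```java
public class NullTaskSketch {
    // Hypothetical stand-in for a message type whose constructor takes a primitive int task id.
    static final class Message {
        final int task;
        final byte[] payload;

        Message(int task, byte[] payload) {
            this.task = task;
            this.payload = payload;
        }
    }

    // Unguarded: auto-unboxing a null Integer throws NullPointerException,
    // much as casting a null to int does on the Clojure side.
    static Message buildUnguarded(Integer task, byte[] payload) {
        return new Message(task, payload);
    }

    // Guarded: fails with a descriptive message instead of a bare NPE.
    static Message buildGuarded(Integer task, byte[] payload) {
        if (task == null) {
            throw new IllegalStateException("tuple has no destination task id");
        }
        return new Message(task, payload);
    }

    // Helper: reports whether the unguarded path throws an NPE for a null task id.
    static boolean unguardedThrowsNpe() {
        try {
            buildUnguarded(null, new byte[0]);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("unguarded throws NPE: " + unguardedThrowsNpe());
        try {
            buildGuarded(null, new byte[0]);
        } catch (IllegalStateException e) {
            System.out.println("guarded: " + e.getMessage());
        }
    }
}
```

The guarded variant only makes the failure easier to diagnose; whether the null comes from shared executor state, as the quoted reply suggests, or from something else is not something this sketch can establish.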