[
https://issues.apache.org/jira/browse/STORM-770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14497686#comment-14497686
]
Michael Pershyn commented on STORM-770:
---------------------------------------
Regarding nimbus-node in 2nd case - everything seems to be fine.
The worker in our case is detected ~32 seconds later after exception happened
as {code}:timed-out{code} and then is tried to be killed by supervisor, but is
already dead. See the supervisor log above.
In regards to recovering the topology functioning on it's own, in our case we
have observed such behaviour some times ago (on storm 0.9.2-incubating) and
made a conclusion that this was STORM-537.
On storm-0.9.3 this issue reported to be fixed, and I confirm that in our case
same topology could return to functioning state on it's own.
> NullPointerException in consumeBatchToCursor
> --------------------------------------------
>
> Key: STORM-770
> URL: https://issues.apache.org/jira/browse/STORM-770
> Project: Apache Storm
> Issue Type: Bug
> Affects Versions: 0.9.2-incubating
> Reporter: Stas Levin
>
> We got the following exception after our topology had been up for ~2 days,
> and I was wondering if it might be related.
> Looks like "task" in "mk-transfer-fn" is null, making "(.add remote
> (TaskMessage. task (.serialize serializer tuple)))" fail on NPE
> (worker.clj:128, storm-core-0.9.2-incubating.jar)
> java.lang.RuntimeException: java.lang.NullPointerException
> at
> backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:128)
> ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
> at
> backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:99)
> ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
> at
> backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:80)
> ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
> at
> backtype.storm.disruptor$consume_loop_STAR_$fn__758.invoke(disruptor.clj:94)
> ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
> at backtype.storm.util$async_loop$fn__457.invoke(util.clj:431)
> ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
> at clojure.lang.AFn.run(AFn.java:24) [clojure-1.5.1.jar:na]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_72]
> Caused by: java.lang.NullPointerException: null
> at clojure.lang.RT.intCast(RT.java:1087) ~[clojure-1.5.1.jar:na]
> at
> backtype.storm.daemon.worker$mk_transfer_fn$fn__5748.invoke(worker.clj:128)
> ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
> at
> backtype.storm.daemon.executor$start_batch_transfer_GT_worker_handler_BANG$fn__5483.invoke(executor.clj:256)
> ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
> at
> backtype.storm.disruptor$clojure_handler$reify__745.onEvent(disruptor.clj:58)
> ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
> at
> backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:125)
> ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
> ... 6 common frames omitted,java.lang.RuntimeException:
> java.lang.NullPointerException
> Any ideas?
> P.S.
> Also saw it here:
> http://mail-archives.apache.org/mod_mbox/storm-user/201501.mbox/%3CCABcMBhCusXXU=v1e66wfuatgyh1euqnd1siog65-tp8xlwx...@mail.gmail.com%3E
> https://mail-archives.apache.org/mod_mbox/storm-user/201408.mbox/%3ccajuqm_4kxhsh2_x08ujuqr76m2c+dswp0fcijbmfcaeyqgs...@mail.gmail.com%3E
> Comment from Bobby
> http://mail-archives.apache.org/mod_mbox/storm-user/201501.mbox/%3c574363643.2791948.1420470097280.javamail.ya...@jws10027.mail.ne1.yahoo.com%3E
> {quote}
> What version of storm are you using? Are any of the bolts shell bolts?
> There is a known
> issue where this can happen if two shell bolts share an executor, because
> they are multi-threaded.
> - Bobby
> {quote}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)