[ 
https://issues.apache.org/jira/browse/STORM-770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14497686#comment-14497686
 ] 

Michael Pershyn commented on STORM-770:
---------------------------------------

Regarding the nimbus node in the 2nd case: everything seems to be fine.
The worker in our case is detected as {code}:timed-out{code} ~32 seconds after 
the exception happened. The supervisor then tries to kill it, but it is 
already dead. See the supervisor log above.

As for the topology recovering on its own: we observed such behaviour some 
time ago (on storm 0.9.2-incubating) and concluded that it was STORM-537.
That issue is reported to be fixed in storm-0.9.3, and I confirm that in our 
case the same topology could return to a functioning state on its own.

> NullPointerException in consumeBatchToCursor
> --------------------------------------------
>
>                 Key: STORM-770
>                 URL: https://issues.apache.org/jira/browse/STORM-770
>             Project: Apache Storm
>          Issue Type: Bug
>    Affects Versions: 0.9.2-incubating
>            Reporter: Stas Levin
>
> We got the following exception after our topology had been up for ~2 days, 
> and I was wondering if it might be related. 
> Looks like "task" in "mk-transfer-fn" is null, making "(.add remote 
> (TaskMessage. task (.serialize serializer tuple)))" fail on NPE 
> (worker.clj:128, storm-core-0.9.2-incubating.jar)
> java.lang.RuntimeException: java.lang.NullPointerException
> at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:128) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
> at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:99) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
> at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:80) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
> at backtype.storm.disruptor$consume_loop_STAR_$fn__758.invoke(disruptor.clj:94) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
> at backtype.storm.util$async_loop$fn__457.invoke(util.clj:431) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
> at clojure.lang.AFn.run(AFn.java:24) [clojure-1.5.1.jar:na]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_72]
> Caused by: java.lang.NullPointerException: null
> at clojure.lang.RT.intCast(RT.java:1087) ~[clojure-1.5.1.jar:na]
> at backtype.storm.daemon.worker$mk_transfer_fn$fn__5748.invoke(worker.clj:128) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
> at backtype.storm.daemon.executor$start_batch_transfer_GT_worker_handler_BANG$fn__5483.invoke(executor.clj:256) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
> at backtype.storm.disruptor$clojure_handler$reify__745.onEvent(disruptor.clj:58) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
> at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:125) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
> ... 6 common frames omitted,java.lang.RuntimeException: java.lang.NullPointerException
> Any ideas?
> P.S.
> Also saw it here: 
> http://mail-archives.apache.org/mod_mbox/storm-user/201501.mbox/%3CCABcMBhCusXXU=v1e66wfuatgyh1euqnd1siog65-tp8xlwx...@mail.gmail.com%3E
> https://mail-archives.apache.org/mod_mbox/storm-user/201408.mbox/%3ccajuqm_4kxhsh2_x08ujuqr76m2c+dswp0fcijbmfcaeyqgs...@mail.gmail.com%3E
> Comment from Bobby
> http://mail-archives.apache.org/mod_mbox/storm-user/201501.mbox/%3c574363643.2791948.1420470097280.javamail.ya...@jws10027.mail.ne1.yahoo.com%3E
> {quote}
> What version of storm are you using?  Are any of the bolts shell bolts?  
> There is a known
> issue where this can happen if two shell bolts share an executor, because 
> they are multi-threaded. 
> - Bobby
> {quote}
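
For readers following the quoted trace: the report attributes the failure to a null "task" reaching the TaskMessage constructor call, and the trace shows the NPE surfacing inside an integer cast (RT.intCast). The Java sketch below illustrates only that general mechanism, under stated assumptions. It is not Storm's or Clojure's actual code: the class NullTaskSketch and its intCast, Message, toMessage and failsWithNpe members are hypothetical names made up for the illustration.

```java
public class NullTaskSketch {
    // Hypothetical stand-in for a numeric cast helper. It dereferences its
    // argument, so a null task id fails here, inside the cast, rather than
    // at the place where the null was originally produced.
    static int intCast(Object x) {
        return ((Number) x).intValue(); // NullPointerException when x is null
    }

    // Hypothetical message type with a primitive int task id, loosely
    // modelled on the constructor call quoted in the report.
    static final class Message {
        final int task;
        final byte[] payload;

        Message(int task, byte[] payload) {
            this.task = task;
            this.payload = payload;
        }
    }

    // Builds a message from a possibly-null boxed task id.
    static Message toMessage(Object task, byte[] payload) {
        return new Message(intCast(task), payload);
    }

    // Returns true if building a message for this task id throws an NPE.
    static boolean failsWithNpe(Object task) {
        try {
            toMessage(task, new byte[0]);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("task=7    -> NPE? " + failsWithNpe(7));
        System.out.println("task=null -> NPE? " + failsWithNpe(null));
    }
}
```

In this sketch the exception is raised in the cast, far from wherever the null task id originated, which is one reason a trace like the one quoted above points at the cast and not at the underlying cause.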



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
