[ 
https://issues.apache.org/jira/browse/STORM-770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507221#comment-14507221
 ] 

Michael Pershyn commented on STORM-770:
---------------------------------------

So, the 0.9.3 deploy went fine and has been running for about 3 hours. I have 
not seen the exception, but I have seen 2 of the WARN messages that we are 
looking for. All our logs are machine-processed and indexed for search, so I 
can guarantee that in the whole cluster (6 machines) only 2 such WARN messages 
occurred in the last 3 hours. However, the scenario is different now.

The messages are unfortunately not as helpful as expected, because there is no 
tuple information printed.

The warning occurs on different bolts of different topologies. 
Both bolts do some work in prepare, though: they set up sessions and 
statements for working with Cassandra.


{code}

2015-04-22T14:30:44.689+0200 b.s.d.executor [INFO] Prepared bolt db-reader:(4)
2015-04-22T14:30:44.789+0200 b.s.d.worker [WARN] Can't transfer tuple - task value is null. tuple information: 

// ... after this it works normally.

{code}

On the second bolt it is the same; only the topology and the bolt name are 
different.

{code}

2015-04-22T14:51:48.847+0200 b.s.d.executor [INFO] Prepared bolt db-writer:(28)
2015-04-22T14:51:55.525+0200 b.s.d.worker [WARN] Can't transfer tuple - task value is null. tuple information: 

{code}


One possible hint is that Storm uses Netty for message transfer, and so does 
the Cassandra driver, which is initialized in the prepare method of these 
bolts. I doubt that the two could somehow interfere, but decided to mention it.

These are the Cassandra driver and its dependencies:
{code}
[com.datastax.cassandra/cassandra-driver-core "2.1.2"]
     [com.codahale.metrics/metrics-core "3.0.2"]
     [io.netty/netty "3.9.0.Final"]
{code}

Storm 0.9.3 uses the same version: {{<netty.version>3.9.0.Final</netty.version>}}

I will leave the patched 0.9.3 running and observe whether the issue appears 
again and under which circumstances.
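
For readers following along, the kind of guard that produces the WARN above 
can be sketched roughly as below. This is only a hypothetical Java 
illustration, not the actual Clojure code in worker.clj: the class 
TransferGuardSketch, the Pair type and the transferable method are invented 
names, and only the warning text is taken from the logs above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: not Storm's worker.clj, only the shape of a null-task guard.
class TransferGuardSketch {
    // Stand-in for a (task, tuple) pair handed to the transfer function.
    static final class Pair {
        final Integer task;   // boxed so it can be null, as in the reported failure
        final String tuple;
        Pair(Integer task, String tuple) { this.task = task; this.tuple = tuple; }
    }

    // Collected warnings; a real worker would write these to its log instead.
    final List<String> warnings = new ArrayList<>();

    // Returns the pairs that are safe to send; pairs without a task are logged and skipped.
    List<Pair> transferable(List<Pair> batch) {
        List<Pair> out = new ArrayList<>();
        for (Pair p : batch) {
            if (p.task == null) {
                warnings.add("Can't transfer tuple - task value is null. tuple information: " + p.tuple);
                continue;
            }
            out.add(p);
        }
        return out;
    }

    public static void main(String[] args) {
        TransferGuardSketch g = new TransferGuardSketch();
        List<Pair> batch = new ArrayList<>();
        batch.add(new Pair(4, "a"));
        batch.add(new Pair(null, "b"));
        List<Pair> ok = g.transferable(batch);
        System.out.println(ok.size() + " sent, " + g.warnings.size() + " skipped");
    }
}
```

With a guard of this shape, one bad pair no longer takes down the whole batch, 
which would match the "after this it works normally" behaviour seen above.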

> NullPointerException in consumeBatchToCursor
> --------------------------------------------
>
>                 Key: STORM-770
>                 URL: https://issues.apache.org/jira/browse/STORM-770
>             Project: Apache Storm
>          Issue Type: Bug
>    Affects Versions: 0.9.2-incubating
>            Reporter: Stas Levin
>
> We got the following exception after our topology had been up for ~2 days, 
> and I was wondering if it might be related. 
> Looks like "task" in "mk-transfer-fn" is null, making "(.add remote 
> (TaskMessage. task (.serialize serializer tuple)))" fail on NPE 
> (worker.clj:128, storm-core-0.9.2-incubating.jar)
> java.lang.RuntimeException: java.lang.NullPointerException
> at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:128) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
> at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:99) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
> at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:80) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
> at backtype.storm.disruptor$consume_loop_STAR_$fn__758.invoke(disruptor.clj:94) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
> at backtype.storm.util$async_loop$fn__457.invoke(util.clj:431) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
> at clojure.lang.AFn.run(AFn.java:24) [clojure-1.5.1.jar:na]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_72]
> Caused by: java.lang.NullPointerException: null
> at clojure.lang.RT.intCast(RT.java:1087) ~[clojure-1.5.1.jar:na]
> at backtype.storm.daemon.worker$mk_transfer_fn$fn__5748.invoke(worker.clj:128) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
> at backtype.storm.daemon.executor$start_batch_transfer_GT_worker_handler_BANG$fn__5483.invoke(executor.clj:256) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
> at backtype.storm.disruptor$clojure_handler$reify__745.onEvent(disruptor.clj:58) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
> at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:125) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
> ... 6 common frames omitted,java.lang.RuntimeException: java.lang.NullPointerException
> Any ideas?
> P.S.
> Also saw it here: 
> http://mail-archives.apache.org/mod_mbox/storm-user/201501.mbox/%3CCABcMBhCusXXU=v1e66wfuatgyh1euqnd1siog65-tp8xlwx...@mail.gmail.com%3E
> https://mail-archives.apache.org/mod_mbox/storm-user/201408.mbox/%3ccajuqm_4kxhsh2_x08ujuqr76m2c+dswp0fcijbmfcaeyqgs...@mail.gmail.com%3E
> Comment from Bobby
> http://mail-archives.apache.org/mod_mbox/storm-user/201501.mbox/%3c574363643.2791948.1420470097280.javamail.ya...@jws10027.mail.ne1.yahoo.com%3E
> {quote}
> What version of storm are you using?  Are any of the bolts shell bolts?  
> There is a known issue where this can happen if two shell bolts share an 
> executor, because they are multi-threaded. 
> - Bobby
> {quote}
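
The "Caused by" frame at clojure.lang.RT.intCast in the trace above is 
consistent with a null task id being cast to an int. A minimal Java sketch of 
that failure mode follows, assuming the cast ends up calling a method on the 
null reference. The intCast below is a simplified stand-in written for this 
example, not Clojure's implementation, and NullTaskCastSketch and failsWithNpe 
are hypothetical names.

```java
// Hypothetical sketch of why a null task id fails inside an int cast.
class NullTaskCastSketch {
    // Simplified Object-to-int cast: the reference cast accepts null,
    // but the method call on the null reference does not.
    static int intCast(Object x) {
        Number n = (Number) x;   // casting null to Number succeeds
        return n.intValue();     // calling a method on null throws NullPointerException
    }

    // Reports whether casting the given task id fails with a NullPointerException.
    static boolean failsWithNpe(Object task) {
        try {
            intCast(task);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(failsWithNpe(7));
        System.out.println(failsWithNpe(null));
    }
}
```

The sketch only shows where the NPE can come from; it says nothing about why 
the task value is null in the first place.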



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
