Unfortunately, I haven't seen this error directly, but here are a few things
I usually check when debugging.

Do you see a corresponding error on the other side of that connection
(alpinenode7.alpinenow.local)? Or is that the same machine?
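
In case it helps, here is a rough sketch of how I'd pull the matching warnings
out of the executor log on the other node. It assumes the log4j line layout
shown in your message; the function name, the constants, and the INFO lines in
the sample are made up for illustration.

```python
import re

# Matches the start of a log4j line such as "14/06/25 19:22:31 WARN ...".
# Assumption: the log uses the same layout as the quoted message.
LOG_LINE = re.compile(r"^\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} (\w+) ")


def find_events(lines, needle, levels=("WARN", "ERROR")):
    """Return (header, trace_lines) pairs for WARN/ERROR entries whose
    header or stack trace mentions `needle` (e.g. a host name)."""
    events = []
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        match = LOG_LINE.match(line)
        if match:
            # A new log entry starts; track it only if its level is of interest.
            current = [line, []] if match.group(1) in levels else None
            if current is not None:
                events.append(current)
        elif current is not None and line.strip():
            # Continuation lines: exception class and "at ..." frames.
            current[1].append(line.strip())
    return [
        (header, trace)
        for header, trace in events
        if needle in header or any(needle in frame for frame in trace)
    ]


# Sample input: the WARN entry is taken from the quoted log, the INFO
# lines are invented filler.
SAMPLE = """\
14/06/25 19:22:30 INFO Executor: Running task
14/06/25 19:22:31 WARN SendingConnection: Error writing in connection to ConnectionManagerId(alpinenode7.alpinenow.local,46251)
java.lang.NullPointerException
    at org.apache.spark.network.SendingConnection.write(Connection.scala:349)
14/06/25 19:22:32 INFO Executor: Finished task
""".splitlines()

if __name__ == "__main__":
    for header, trace in find_events(SAMPLE, "alpinenode7"):
        print(header)
        for frame in trace:
            print("    " + frame)
```

Running the same function over the logs from both machines and comparing
timestamps should show whether the other side saw an error around 19:22:31.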

Also, do the driver logs show a longer stack trace? And have you enabled
the history server, so you can see more details about the execution? That
helps me tremendously.
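
For the history server, these are the settings I have in mind. This is a
minimal sketch assuming Spark 1.0 or later; the directory is a placeholder, so
substitute any location your driver can write to.

```
spark.eventLog.enabled   true
spark.eventLog.dir       hdfs:///tmp/spark-events
```

These can go in conf/spark-defaults.conf or be set on the SparkConf. How the
history server itself is pointed at that directory varies by Spark version, so
check the monitoring docs for the release you are running.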

-Suren



On Wed, Jun 25, 2014 at 11:08 PM, Sung Hwan Chung <coded...@cs.stanford.edu>
wrote:

> I'm seeing the following message in the log of an executor. Has anyone seen
> this error? After this, the executor seems to lose the cache, and besides
> that the whole thing slows down drastically - i.e. it gets stuck in a reduce
> phase for 40+ minutes, whereas before it was finishing reduces in 2~3 seconds.
>
>
>
> 14/06/25 19:22:31 WARN SendingConnection: Error writing in connection to ConnectionManagerId(alpinenode7.alpinenow.local,46251)
> java.lang.NullPointerException
>       at org.apache.spark.network.MessageChunkHeader.buffer$lzycompute(MessageChunkHeader.scala:35)
>       at org.apache.spark.network.MessageChunkHeader.buffer(MessageChunkHeader.scala:32)
>       at org.apache.spark.network.MessageChunk.buffers$lzycompute(MessageChunk.scala:31)
>       at org.apache.spark.network.MessageChunk.buffers(MessageChunk.scala:29)
>       at org.apache.spark.network.SendingConnection.write(Connection.scala:349)
>       at org.apache.spark.network.ConnectionManager$$anon$5.run(ConnectionManager.scala:142)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:724)
>
>


-- 

SUREN HIRAMAN, VP TECHNOLOGY
Velos
Accelerating Machine Learning

440 NINTH AVENUE, 11TH FLOOR
NEW YORK, NY 10001
O: (917) 525-2466 ext. 105
F: 646.349.4063
E: suren.hiraman@velos.io <suren.hira...@sociocast.com>
W: www.velos.io
