Which version of Spark are you running? It looks like you are hitting this issue:
https://issues.apache.org/jira/browse/SPARK-4516
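
As a quick check on the log quoted below, here is a minimal Python sketch that pulls the segment length out of the ChunkFetchSuccess line and converts it to GiB. The helper name segment_size_gib and the regular expression are my own illustration, not part of Spark, and the file path in the sample line is shortened for readability.

```python
import re

# Sample line taken from the quoted log; the file path is shortened
# here for readability (illustrative, not the full original path).
LOG_LINE = (
    "ChunkFetchSuccess{streamChunkId=StreamChunkId{streamId=1033433133943, "
    "chunkIndex=1}, buffer=FileSegmentManagedBuffer{file=shuffle_0_56_0.data, "
    "offset=896, length=1132499356}} to /10.10.255.238:35430; closing connection"
)

# Matches the offset/length pair printed by FileSegmentManagedBuffer.
SEGMENT_RE = re.compile(r"offset=(\d+), length=(\d+)")


def segment_size_gib(line):
    """Return the logged segment length in GiB, or None if the line has none."""
    match = SEGMENT_RE.search(line)
    if match is None:
        return None
    return int(match.group(2)) / 2**30


size = segment_size_gib(LOG_LINE)
print(f"shuffle segment: {size:.2f} GiB")
```

For the line in your log this comes out at roughly 1.05 GiB for a single shuffle segment, which may be worth keeping in mind when you read through the JIRA issue.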

Thanks
Best Regards

On Wed, Jun 3, 2015 at 1:06 PM, patcharee <patcharee.thong...@uni.no> wrote:

>  This is the log I can get:
>
> 15/06/02 16:37:31 INFO shuffle.RetryingBlockFetcher: Retrying fetch (2/3) for 4 outstanding blocks after 5000 ms
> 15/06/02 16:37:36 INFO client.TransportClientFactory: Found inactive connection to compute-10-3.local/10.10.255.238:33671, creating a new one.
> 15/06/02 16:37:36 WARN server.TransportChannelHandler: Exception in connection from /10.10.255.238:35430
> java.io.IOException: Connection reset by peer
>         at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>         at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
>         at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
>         at sun.nio.ch.IOUtil.read(IOUtil.java:192)
>         at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
>         at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:311)
>         at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881)
>         at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:225)
>         at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>         at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
>         at java.lang.Thread.run(Thread.java:744)
> 15/06/02 16:37:36 ERROR server.TransportRequestHandler: Error sending result ChunkFetchSuccess{streamChunkId=StreamChunkId{streamId=1033433133943, chunkIndex=1}, buffer=FileSegmentManagedBuffer{file=/hdisk3/hadoop/yarn/local/usercache/patcharee/appcache/application_1432633634512_0213/blockmgr-12d59e6b-0895-4a0e-9d06-152d2f7ee855/09/shuffle_0_56_0.data, offset=896, length=1132499356}} to /10.10.255.238:35430; closing connection
> java.nio.channels.ClosedChannelException
> 15/06/02 16:37:38 ERROR shuffle.RetryingBlockFetcher: Exception while beginning fetch of 4 outstanding blocks (after 2 retries)
> java.io.IOException: Failed to connect to compute-10-3.local/10.10.255.238:33671
>         at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:191)
>         at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:156)
>         at org.apache.spark.network.netty.NettyBlockTransferService$$anon$1.createAndStart(NettyBlockTransferService.scala:78)
>         at org.apache.spark.network.shuffle.RetryingBlockFetcher.fetchAllOutstanding(RetryingBlockFetcher.java:140)
>         at org.apache.spark.network.shuffle.RetryingBlockFetcher.access$200(RetryingBlockFetcher.java:43)
>         at org.apache.spark.network.shuffle.RetryingBlockFetcher$1.run(RetryingBlockFetcher.java:170)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:744)
> Caused by: java.net.ConnectException: Connection refused: compute-10-3.local/10.10.255.238:33671
>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:735)
>         at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:208)
>         at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:287)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>         at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
>         ... 1 more
>
>
>
> Best,
> Patcharee
>
>
> On 03. juni 2015 09:21, Akhil Das wrote:
>
>  You need to look into your executor/worker logs to see what's going on.
>
>  Thanks
> Best Regards
>
> On Wed, Jun 3, 2015 at 12:01 PM, patcharee <patcharee.thong...@uni.no> wrote:
>
>> Hi,
>>
>> What can be the cause of this ERROR cluster.YarnScheduler: Lost executor?
>> How can I fix it?
>>
>> Best,
>> Patcharee
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org
>>
>>
>
>
