dzcxzl created SPARK-43301:
------------------------------

             Summary: BlockStoreClient getHostLocalDirs RPC supports IOException retry
                 Key: SPARK-43301
                 URL: https://issues.apache.org/jira/browse/SPARK-43301
             Project: Spark
          Issue Type: Improvement
          Components: Spark Core
    Affects Versions: 3.0.0
            Reporter: dzcxzl


The BlockStoreClient#getHostLocalDirs RPC is not retried when an IOException 
occurs. The failure is instead surfaced as a FetchFailedException.

 
{code:java}
23/04/24 01:24:55,158 [shuffle-client-7-1] WARN ExternalBlockStoreClient: Error while trying to get the host local dirs for [148]
23/04/24 01:24:55,158 [shuffle-client-7-1] ERROR ShuffleBlockFetcherIterator: Error occurred while fetching host local blocks
java.io.IOException: Connection reset by peer
        at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
        at sun.nio.ch.IOUtil.read(IOUtil.java:192)
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
        at io.netty.buffer.PooledByteBuf.setBytes(PooledByteBuf.java:253)
        at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1132)
        at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:350)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:151)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:719)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.lang.Thread.run(Thread.java:745)
{code}
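
For illustration only, the requested behavior (retry the call when it fails with an IOException instead of failing immediately) could be sketched as below. This is not Spark's code and not the proposed patch: the class HostLocalDirsRetry, the interface IoCall, and the parameters maxRetries and retryWaitMs are hypothetical names, and the sketch is synchronous, whereas the real getHostLocalDirs call completes asynchronously.

```java
import java.io.IOException;

/** Hypothetical helper: retries a call that may fail with an IOException. */
final class HostLocalDirsRetry {

    /** Hypothetical stand-in for an RPC such as getHostLocalDirs. */
    @FunctionalInterface
    interface IoCall<T> {
        T call() throws IOException;
    }

    private HostLocalDirsRetry() {}

    /**
     * Invokes the call once and, on IOException, up to maxRetries more times,
     * sleeping retryWaitMs between attempts. Rethrows the last IOException
     * once all attempts are used up. Other exception types are not retried.
     */
    static <T> T callWithRetry(IoCall<T> call, int maxRetries, long retryWaitMs)
            throws IOException, InterruptedException {
        if (maxRetries < 0 || retryWaitMs < 0) {
            throw new IllegalArgumentException("maxRetries and retryWaitMs must be >= 0");
        }
        IOException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return call.call();
            } catch (IOException e) {
                last = e; // e.g. "Connection reset by peer"
                if (attempt < maxRetries) {
                    Thread.sleep(retryWaitMs);
                }
            }
        }
        throw last; // non-null here: the loop body ran at least once
    }
}
```

With maxRetries set to 3, a call that fails twice with "Connection reset by peer" and then succeeds would return its result on the third attempt. A call that keeps failing is attempted four times in total before the last IOException propagates to the caller.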



