[ https://issues.apache.org/jira/browse/SPARK-43301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated SPARK-43301:
-----------------------------------
Labels: pull-request-available (was: )
> BlockStoreClient getHostLocalDirs RPC supports IOException retry
> ----------------------------------------------------------------
>
> Key: SPARK-43301
> URL: https://issues.apache.org/jira/browse/SPARK-43301
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 3.0.0
> Reporter: dzcxzl
> Priority: Minor
> Labels: pull-request-available
>
> The BlockStoreClient#getHostLocalDirs RPC is not retried when an IOException
> occurs, so a FetchFailedException is thrown instead.
>
> {code:java}
> 23/04/24 01:24:55,158 [shuffle-client-7-1] WARN ExternalBlockStoreClient: Error while trying to get the host local dirs for [148]
> 23/04/24 01:24:55,158 [shuffle-client-7-1] ERROR ShuffleBlockFetcherIterator: Error occurred while fetching host local blocks
> java.io.IOException: Connection reset by peer
> at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
> at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
> at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
> at sun.nio.ch.IOUtil.read(IOUtil.java:192)
> at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
> at io.netty.buffer.PooledByteBuf.setBytes(PooledByteBuf.java:253)
> at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1132)
> at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:350)
> at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:151)
> at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:719)
> at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655)
> at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581)
> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
> at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
> at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
> at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> at java.lang.Thread.run(Thread.java:745)
> {code}
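
For illustration only, the kind of IOException retry the issue title asks for could be sketched roughly as below. This is not the change proposed for SPARK-43301 and not Spark's actual API: the class IoRetrySketch, the interface IoCall, and the parameters maxRetries and retryWaitMs are all hypothetical names, and the fixed wait between attempts is an assumption about how such a retry might be paced.

```java
import java.io.IOException;

/** Hypothetical sketch: retry a call that may fail with IOException. */
public class IoRetrySketch {

    /** Hypothetical stand-in for one attempt of an RPC such as getHostLocalDirs. */
    @FunctionalInterface
    public interface IoCall<T> {
        T call() throws IOException;
    }

    /**
     * Runs the call once and, if it throws IOException, retries it up to
     * maxRetries more times, sleeping retryWaitMs between attempts.
     * Other exception types are not caught and propagate immediately.
     * If every attempt fails, the last IOException is rethrown.
     */
    public static <T> T callWithRetry(IoCall<T> call, int maxRetries, long retryWaitMs)
            throws IOException, InterruptedException {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        IOException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return call.call();
            } catch (IOException e) {
                last = e;
                // Wait only if another attempt will follow.
                if (attempt < maxRetries) {
                    Thread.sleep(retryWaitMs);
                }
            }
        }
        // The loop ran at least once, so last is non-null here.
        throw last;
    }
}
```

With a wrapper of this shape, a transient "Connection reset by peer" like the one in the log above would trigger another attempt, and the caller would see a failure only after all attempts are exhausted.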
--
This message was sent by Atlassian Jira
(v8.20.10#820010)