Zhongwei Zhu created SPARK-42104:
------------------------------------

             Summary: Throw ExecutorDeadException in fetchBlocks when executor dead
                 Key: SPARK-42104
                 URL: https://issues.apache.org/jira/browse/SPARK-42104
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 3.3.1
            Reporter: Zhongwei Zhu


When fetchBlocks fails with an IOException and the executor is dead, an 
ExecutorDeadException is thrown.

In other cases, a dead executor causes a TimeoutException or another 
exception instead, for example:
{code:java}
Caused by: java.lang.RuntimeException: java.util.concurrent.TimeoutException: Waited 30000 milliseconds (plus 143334 nanoseconds delay) for SettableFuture@624de392[status=PENDING]
    at org.sparkproject.guava.base.Throwables.propagate(Throwables.java:243)
    at org.apache.spark.network.client.TransportClient.sendRpcSync(TransportClient.java:293)
    at org.apache.spark.network.crypto.AuthClientBootstrap.doSparkAuth(AuthClientBootstrap.java:113)
    at org.apache.spark.network.crypto.AuthClientBootstrap.doBootstrap(AuthClientBootstrap.java:80)
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:300)
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:218)
    at org.apache.spark.network.netty.NettyBlockTransferService$$anon$2.createAndStart(NettyBlockTransferService.scala:126)
    at org.apache.spark.network.shuffle.RetryingBlockTransferor.transferAllOutstanding(RetryingBlockTransferor.java:154)
    at org.apache.spark.network.shuffle.RetryingBlockTransferor.lambda$initiateRetry$0(RetryingBlockTransferor.java:184)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
 {code}
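
The difference between the current behavior and the one the summary asks 
for can be sketched as follows. This is a minimal, self-contained 
illustration and not Spark's actual code: the class FetchFailureSketch, 
the methods translateIoOnly and translateAny, the isAlive predicate, and 
the local ExecutorDeadException class are all hypothetical stand-ins. It 
assumes that executor liveness can be queried by executor id.

```java
import java.io.IOException;
import java.util.concurrent.TimeoutException;
import java.util.function.Predicate;

final class FetchFailureSketch {

    /** Hypothetical stand-in for Spark's ExecutorDeadException. */
    static class ExecutorDeadException extends RuntimeException {
        ExecutorDeadException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /**
     * Behavior as described in this issue: the liveness check runs only
     * when the failure is an IOException. Any other failure is returned
     * unchanged, even if the executor is dead.
     */
    static Throwable translateIoOnly(Throwable failure, String execId,
                                     Predicate<String> isAlive) {
        if (failure instanceof IOException && !isAlive.test(execId)) {
            return new ExecutorDeadException(
                "Executor " + execId + " is dead", failure);
        }
        return failure;
    }

    /**
     * One possible reading of the requested behavior (an assumption):
     * run the liveness check for every failure type.
     */
    static Throwable translateAny(Throwable failure, String execId,
                                  Predicate<String> isAlive) {
        if (!isAlive.test(execId)) {
            return new ExecutorDeadException(
                "Executor " + execId + " is dead", failure);
        }
        return failure;
    }

    public static void main(String[] args) {
        // Same shape as the trace above: RuntimeException wrapping a timeout.
        Throwable failure = new RuntimeException(
            new TimeoutException("Waited 30000 milliseconds"));
        Predicate<String> executorIsDead = id -> false;

        System.out.println(
            translateIoOnly(failure, "1", executorIsDead).getClass().getSimpleName());
        System.out.println(
            translateAny(failure, "1", executorIsDead).getClass().getSimpleName());
    }
}
```

With a dead executor, translateIoOnly hands back the original 
RuntimeException, while translateAny wraps it in ExecutorDeadException 
and keeps the timeout as the cause.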



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

