[ https://issues.apache.org/jira/browse/SPARK-40096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-40096:
------------------------------------

    Assignee:     (was: Apache Spark)

> Finalize shuffle merge slow due to connection creation fails
> ------------------------------------------------------------
>
>                 Key: SPARK-40096
>                 URL: https://issues.apache.org/jira/browse/SPARK-40096
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 3.3.0
>            Reporter: Wan Kun
>            Priority: Major
>
> *How to reproduce this issue*
> * Enable push-based shuffle.
> * Remove some merger nodes before the finalize RPCs are sent.
> * The driver tries to connect to those merger shuffle services and send the
> finalize RPCs one by one; each connection creation times out after
> SPARK_NETWORK_IO_CONNECTIONCREATIONTIMEOUT_KEY (120s by default).
>
> We can send these RPCs in the *shuffleMergeFinalizeScheduler* thread pool and
> handle the connection creation exception, so that one unreachable merger does
> not delay the finalize requests to the others.
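
The idea proposed in the issue (send each finalize RPC from a thread pool and handle connection failures per merger) can be illustrated with a minimal sketch. This is not Spark's code: FinalizeSketch, MergerClient, sendFinalize and finalizeAll are hypothetical names chosen for illustration, and a plain java.util.concurrent pool stands in for shuffleMergeFinalizeScheduler.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class FinalizeSketch {
    /** Hypothetical stand-in for "connect to a merger and send the finalize RPC". */
    interface MergerClient {
        void sendFinalize(String mergerHost) throws IOException;
    }

    /**
     * Submits one finalize request per merger to a thread pool and returns the
     * mergers whose request failed. A merger that cannot be reached only blocks
     * its own task instead of every request queued behind it.
     */
    static List<String> finalizeAll(List<String> mergers, MergerClient client, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit everything first so the requests run concurrently.
            List<Future<Object>> pending = new ArrayList<>();
            for (String merger : mergers) {
                pending.add(pool.submit(() -> {
                    client.sendFinalize(merger);
                    return null;
                }));
            }

            // Collect results; a failure is recorded and the loop continues.
            List<String> failed = new ArrayList<>();
            for (int i = 0; i < pending.size(); i++) {
                try {
                    pending.get(i).get();
                } catch (ExecutionException e) {
                    // Connection creation (or the RPC itself) failed for this merger.
                    failed.add(mergers.get(i));
                } catch (InterruptedException e) {
                    // Preserve the interrupt flag and treat the request as failed.
                    Thread.currentThread().interrupt();
                    failed.add(mergers.get(i));
                }
            }
            return failed;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Simulated client: "merger-2" has been removed and refuses connections.
        MergerClient client = host -> {
            if (host.equals("merger-2")) {
                throw new IOException("connection creation failed: " + host);
            }
        };
        List<String> failed =
            finalizeAll(Arrays.asList("merger-1", "merger-2", "merger-3"), client, 2);
        System.out.println("Failed mergers: " + failed);
    }
}
```

With sequential sends, the total wait grows with the number of unreachable mergers multiplied by the connection timeout. With a pool large enough to hold the unreachable mergers, the wait is bounded by roughly one timeout. How Spark sizes and schedules the real pool is not covered by this sketch.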