[ https://issues.apache.org/jira/browse/SPARK-38965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wan Kun updated SPARK-38965:
----------------------------
    Summary: Optimize RemoteBlockPushResolver with a memory pool  (was: Retry transfer blocks for exceptions listed in the error handler)

> Optimize RemoteBlockPushResolver with a memory pool
> ---------------------------------------------------
>
>                 Key: SPARK-38965
>                 URL: https://issues.apache.org/jira/browse/SPARK-38965
>             Project: Spark
>          Issue Type: Bug
>          Components: Shuffle
>    Affects Versions: 3.3.0
>            Reporter: Wan Kun
>            Priority: Minor
>
> For the push-based shuffle service, many {{BLOCK_APPEND_COLLISION_DETECTED}}
> errors occur when there are many small map task outputs. In
> {{RemoteBlockPushResolver}}, while the pushed block of one map task is being
> written, the pushed blocks of other map tasks fail in the {{onComplete()}}
> method.
> {{RemoteBlockPushResolver}} also has no memory limit, so many executors will
> run out of memory (OOM) when many small pushed blocks are waiting to be
> written to the final data file.
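
The issue text does not say how the proposed memory pool would work. As a rough illustration only, the sketch below shows one way a byte budget could bound the blocks that wait for the single writer, so that a push is rejected when the budget is exhausted instead of accumulating without limit. The class PushedBlockBuffer and its methods are hypothetical names; they are not part of Spark or of RemoteBlockPushResolver.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/**
 * Hypothetical sketch (not Spark code): a byte-bounded buffer for pushed
 * blocks that are waiting for the single writer of a merged data file.
 */
final class PushedBlockBuffer {
    private final long capacityBytes;   // assumed fixed memory budget
    private long usedBytes = 0;         // bytes currently held in the queue
    private final Queue<byte[]> pending = new ArrayDeque<>();

    PushedBlockBuffer(long capacityBytes) {
        if (capacityBytes <= 0) {
            throw new IllegalArgumentException("capacityBytes must be positive");
        }
        this.capacityBytes = capacityBytes;
    }

    /**
     * Queues a block if it fits in the remaining budget. Returns false when
     * the budget would be exceeded, so the caller can reject or retry the push
     * rather than hold the block in memory.
     */
    synchronized boolean offer(byte[] block) {
        if (usedBytes + block.length > capacityBytes) {
            return false;
        }
        pending.add(block);
        usedBytes += block.length;
        return true;
    }

    /**
     * Hands the next waiting block to the writer and releases its bytes from
     * the budget. Returns null when nothing is waiting.
     */
    synchronized byte[] poll() {
        byte[] block = pending.poll();
        if (block != null) {
            usedBytes -= block.length;
        }
        return block;
    }

    synchronized long usedBytes() {
        return usedBytes;
    }
}
```

With a budget of 10 bytes, a 6-byte block is accepted and a following 5-byte block is refused until the writer has drained the first one with poll(). An actual fix may differ substantially, for example by blocking the pusher or by spilling to disk.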