[ https://issues.apache.org/jira/browse/SPARK-52923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
gaoyajun02 updated SPARK-52923:
-------------------------------
    Description: 
Background:
While `spark.shuffle.push.enabled` provides global control over push-based shuffle, there are scenarios requiring more granular control:
- Mass job migration scenarios where different jobs may need different shuffle strategies
- Remote shuffle managers need shuffle-level fallback capabilities to push-based shuffle
- Dynamic decision making based on shuffle characteristics during shuffle registration

Currently, the _shuffleMergeAllowed flag in ShuffleDependency is initialized after shuffle registration, preventing ShuffleManager implementations from modifying this capability based on their specific requirements or runtime conditions.

This change moves the initialization of numPartitions and _shuffleMergeAllowed before the shuffleManager.registerShuffle() call, allowing ShuffleManager implementations to dynamically disable push-based shuffle merge during registration if needed.

  was:
Background:
While `spark.shuffle.push.enabled` provides global control over push-based shuffle, there are scenarios requiring more granular control:
- Mass job migration scenarios where different jobs may need different shuffle strategies
- Remote shuffle managers need shuffle-level fallback capabilities from push-based shuffle
- Dynamic decision making based on shuffle characteristics during shuffle registration

Currently, the _shuffleMergeAllowed flag in ShuffleDependency is initialized after shuffle registration, preventing ShuffleManager implementations from modifying this capability based on their specific requirements or runtime conditions.

This change moves the initialization of numPartitions and _shuffleMergeAllowed before the shuffleManager.registerShuffle() call, allowing ShuffleManager implementations to dynamically disable push-based shuffle merge during registration if needed.

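The ordering problem the description refers to can be illustrated with a small, self-contained sketch. It is only a model of the idea, not Spark code: `MergeControl`, `DisablingManager`, `RegisterFirstDependency`, `InitializeFirstDependency` and `InitOrderDemo` are hypothetical names invented for this example, and the real ShuffleDependency and ShuffleManager have different signatures. The sketch assumes a manager that tries to turn merge off for a shuffle while registering it, and compares the two initialization orders.

```java
// Hypothetical stand-ins for illustration only; these are NOT Spark's real classes.
interface MergeControl {
    void setShuffleMergeAllowed(boolean allowed);
    boolean shuffleMergeAllowed();
}

/** A manager that wants to disable push-based merge for every shuffle it registers. */
final class DisablingManager {
    int register(MergeControl dependency) {
        dependency.setShuffleMergeAllowed(false);
        return 0; // placeholder for a shuffle handle
    }
}

/** Registration runs first, so the later flag initialization overwrites the manager's choice. */
final class RegisterFirstDependency implements MergeControl {
    final int handle;
    private boolean mergeAllowed;

    RegisterFirstDependency(DisablingManager manager) {
        this.handle = manager.register(this); // manager sets the flag to false here
        this.mergeAllowed = true;             // default applied afterwards, false is lost
    }

    public void setShuffleMergeAllowed(boolean allowed) { mergeAllowed = allowed; }
    public boolean shuffleMergeAllowed() { return mergeAllowed; }
}

/** The flag is initialized first, so the manager's choice during registration survives. */
final class InitializeFirstDependency implements MergeControl {
    final int handle;
    private boolean mergeAllowed;

    InitializeFirstDependency(DisablingManager manager) {
        this.mergeAllowed = true;             // default applied before registration
        this.handle = manager.register(this); // manager sets the flag to false, and it sticks
    }

    public void setShuffleMergeAllowed(boolean allowed) { mergeAllowed = allowed; }
    public boolean shuffleMergeAllowed() { return mergeAllowed; }
}

final class InitOrderDemo {
    static boolean registerFirst() {
        return new RegisterFirstDependency(new DisablingManager()).shuffleMergeAllowed();
    }

    static boolean initializeFirst() {
        return new InitializeFirstDependency(new DisablingManager()).shuffleMergeAllowed();
    }

    public static void main(String[] args) {
        System.out.println("register first, merge allowed: " + registerFirst());     // true
        System.out.println("initialize first, merge allowed: " + initializeFirst()); // false
    }
}
```

In the first ordering the manager's call has no lasting effect, because the default is written after registration returns. In the second ordering the default is already in place when the manager runs, so its decision is the last write and is kept.
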
> Allow ShuffleManager to control push merge during shuffle registration
> ----------------------------------------------------------------------
>
>                 Key: SPARK-52923
>                 URL: https://issues.apache.org/jira/browse/SPARK-52923
>             Project: Spark
>          Issue Type: Improvement
>          Components: Shuffle
>    Affects Versions: 3.5.0, 4.0.0
>            Reporter: gaoyajun02
>            Priority: Minor
>              Labels: pull-request-available
>
> Background:
> While `spark.shuffle.push.enabled` provides global control over push-based
> shuffle, there are scenarios requiring more granular control:
> - Mass job migration scenarios where different jobs may need different
> shuffle strategies
> - Remote shuffle managers need shuffle-level fallback capabilities to
> push-based shuffle
> - Dynamic decision making based on shuffle characteristics during shuffle
> registration
>
> Currently, the _shuffleMergeAllowed flag in ShuffleDependency is initialized
> after shuffle registration, preventing ShuffleManager implementations from
> modifying this capability based on their specific requirements or runtime
> conditions.
>
> This change moves the initialization of numPartitions and
> _shuffleMergeAllowed before the shuffleManager.registerShuffle() call,
> allowing ShuffleManager implementations to dynamically disable push-based
> shuffle merge during registration if needed.