[ https://issues.apache.org/jira/browse/SPARK-52923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
gaoyajun02 updated SPARK-52923:
-------------------------------
    Description:
Background:
Although spark.shuffle.push.enabled can globally control push-based shuffle, there are scenarios that require more fine-grained control:
- Mass job migration scenarios where different jobs may need different shuffle strategies
- Remote shuffle managers that need shuffle-level fallback capabilities from push-based shuffle
- Dynamic decision making based on shuffle characteristics during shuffle registration

Currently, the _shuffleMergeAllowed flag in ShuffleDependency is initialized after shuffle registration, preventing ShuffleManager implementations from modifying this capability based on their specific requirements or runtime conditions.

This change moves the initialization of numPartitions and _shuffleMergeAllowed before the shuffleManager.registerShuffle() call, allowing ShuffleManager implementations to dynamically disable push-based shuffle merge during registration if needed.

  was:
Background:
Although spark.shuffle.push.enabled can globally control push-based shuffle, there are scenarios that require more fine-grained control:
- Mass job migration scenarios where different jobs may need different shuffle strategies
- External remote shuffle services that need shuffle-level fallback from push-based shuffle
- Dynamic decision making based on shuffle characteristics during registration

Currently, the _shuffleMergeAllowed flag in ShuffleDependency is initialized after shuffle registration, preventing ShuffleManager implementations from modifying this capability based on their specific requirements or runtime conditions.

This change moves the initialization of numPartitions and _shuffleMergeAllowed before the shuffleManager.registerShuffle() call, allowing ShuffleManager implementations to dynamically disable push-based shuffle merge during registration if needed.

> Allow ShuffleManager to control push merge during shuffle registration
> ----------------------------------------------------------------------
>
>                 Key: SPARK-52923
>                 URL: https://issues.apache.org/jira/browse/SPARK-52923
>             Project: Spark
>          Issue Type: Improvement
>          Components: Shuffle
>    Affects Versions: 3.5.0, 4.0.0
>            Reporter: gaoyajun02
>            Priority: Minor
>
> Background:
> Although spark.shuffle.push.enabled can globally control push-based shuffle,
> there are scenarios that require more fine-grained control:
> - Mass job migration scenarios where different jobs may need different
> shuffle strategies
> - Remote shuffle managers that need shuffle-level fallback capabilities from
> push-based shuffle
> - Dynamic decision making based on shuffle characteristics during shuffle
> registration
>
> Currently, the _shuffleMergeAllowed flag in ShuffleDependency is initialized
> after shuffle registration, preventing ShuffleManager implementations from
> modifying this capability based on their specific requirements or runtime
> conditions.
>
> This change moves the initialization of numPartitions and
> _shuffleMergeAllowed before the shuffleManager.registerShuffle() call,
> allowing ShuffleManager implementations to dynamically disable push-based
> shuffle merge during registration if needed.
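
The ordering problem behind this issue can be illustrated with a small self-contained model. This is only a sketch under stated assumptions: ToyDependency and ToyShuffleManager are hypothetical stand-ins, not Spark's ShuffleDependency or ShuffleManager, and the initFlagFirst switch exists purely to compare the two initialization orders side by side. It suggests why a value set by the manager during registration is lost when the flag's default is assigned afterwards, and kept when the default is assigned first.

```java
public class ShuffleInitOrderSketch {

    /** Hypothetical stand-in for a shuffle manager; not Spark's real interface. */
    interface ToyShuffleManager {
        int registerShuffle(ToyDependency dep);
    }

    /** Hypothetical stand-in for a shuffle dependency with a merge-allowed flag. */
    static class ToyDependency {
        private boolean shuffleMergeAllowed;
        private final int handle;

        ToyDependency(ToyShuffleManager manager, boolean pushEnabled, boolean initFlagFirst) {
            if (initFlagFirst) {
                // Proposed order: assign the default, then let the manager adjust it.
                this.shuffleMergeAllowed = pushEnabled;
                this.handle = manager.registerShuffle(this);
            } else {
                // Order described as the current behaviour: register first,
                // then assign the default, overwriting whatever the manager set.
                this.handle = manager.registerShuffle(this);
                this.shuffleMergeAllowed = pushEnabled;
            }
        }

        void setShuffleMergeAllowed(boolean allowed) {
            this.shuffleMergeAllowed = allowed;
        }

        boolean isShuffleMergeAllowed() {
            return shuffleMergeAllowed;
        }
    }

    public static void main(String[] args) {
        // A manager that wants to fall back from push-based merge for this shuffle.
        ToyShuffleManager fallbackManager = dep -> {
            dep.setShuffleMergeAllowed(false);
            return 0;
        };

        ToyDependency registerFirst = new ToyDependency(fallbackManager, true, false);
        ToyDependency flagFirst = new ToyDependency(fallbackManager, true, true);

        // prints "register first, merge allowed: true" (the manager's choice was overwritten)
        System.out.println("register first, merge allowed: " + registerFirst.isShuffleMergeAllowed());
        // prints "flag first, merge allowed: false" (the manager's choice survives)
        System.out.println("flag first, merge allowed: " + flagFirst.isShuffleMergeAllowed());
    }
}
```

In this model the only difference between the two outcomes is the order of the two assignments in the constructor, which is the kind of reordering the issue proposes.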