This is an automated email from the ASF dual-hosted git repository.

zuston pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-uniffle.git


The following commit(s) were added to refs/heads/master by this push:
     new 8ea1fc78f [#1490] improvement(spark3): Disable dynamic allocation shuffle tracking by default (#1491)
8ea1fc78f is described below

commit 8ea1fc78f7ea723fc15afecc06459613cb52c47a
Author: Zhen Wang <[email protected]>
AuthorDate: Tue Jan 30 10:24:53 2024 +0800

    [#1490] improvement(spark3): Disable dynamic allocation shuffle tracking by default (#1491)
    
    ### What changes were proposed in this pull request?
    
    When using Uniffle with Spark 3.5, I found that idle executors did not exit in time when dynamic resource allocation (DRA) was enabled with the following configuration:
    
    ```
    spark.dynamicAllocation.enabled true
    spark.shuffle.manager   org.apache.spark.shuffle.RssShuffleManager
    spark.shuffle.service.enabled   false
    spark.shuffle.sort.io.plugin.class      org.apache.spark.shuffle.RssShuffleDataIo
    ```
    
    As mentioned in [SPARK-39846](https://issues.apache.org/jira/browse/SPARK-39846), shuffle tracking is enabled by default in Spark 3.4.0.
    
    When the shuffle service is disabled and shuffle tracking is enabled (the default), an idle executor only exits after its shuffle data has been cleaned up. So RssShuffleManager should disable shuffle tracking by default.
    
    Reference: https://github.com/apache/spark/blob/8dd395b2eabd2815982022b38a5287dae7af8b82/core/src/main/scala/org/apache/spark/scheduler/dynalloc/ExecutorMonitor.scala#L55
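    
    The interaction between the two settings can be modeled roughly as follows. This is a simplified, hypothetical sketch and not Spark's actual `ExecutorMonitor` code: the class `ShuffleTrackingSketch`, its methods, and the plain `Map` standing in for `SparkConf` are assumptions made for illustration, and the real monitor also honors a shuffle tracking timeout that is ignored here.
    
    ```java
    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical, simplified model of the idle-executor decision; not Spark source code.
    public class ShuffleTrackingSketch {

        // Shuffle tracking only takes effect when the external shuffle service is off
        // and spark.dynamicAllocation.shuffleTracking.enabled is true.
        // The "true" default mirrors the Spark 3.4.0+ default described above.
        static boolean shuffleTrackingActive(Map<String, String> conf) {
            boolean shuffleService = Boolean.parseBoolean(
                conf.getOrDefault("spark.shuffle.service.enabled", "false"));
            boolean tracking = Boolean.parseBoolean(
                conf.getOrDefault("spark.dynamicAllocation.shuffleTracking.enabled", "true"));
            return !shuffleService && tracking;
        }

        // An idle executor is held back only while tracking is active and it is
        // still considered to own shuffle data that has not been cleaned up.
        static boolean canReleaseIdleExecutor(Map<String, String> conf, boolean holdsActiveShuffle) {
            return !(shuffleTrackingActive(conf) && holdsActiveShuffle);
        }

        public static void main(String[] args) {
            Map<String, String> conf = new HashMap<>();
            conf.put("spark.shuffle.service.enabled", "false");
            // Tracking left at its default: the executor is held back.
            System.out.println(canReleaseIdleExecutor(conf, true));

            // The override proposed in this change: tracking disabled.
            conf.put("spark.dynamicAllocation.shuffleTracking.enabled", "false");
            System.out.println(canReleaseIdleExecutor(conf, true));
        }
    }
    ```
    
    In this model, forcing the tracking flag to `false` lets an idle executor be released even while it is still recorded as owning shuffle data.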
    
    
    With shuffle tracking enabled (the default in Spark 3.5):
    
    
![image](https://github.com/apache/incubator-uniffle/assets/17894939/a6d92a5b-9005-4481-868a-3e4065416d3f)
    
    
    After disabling shuffle tracking:
    
    
![image](https://github.com/apache/incubator-uniffle/assets/17894939/02671143-6162-43a3-84cf-1b9eac1db338)
    
    
    ### Why are the changes needed?
    
    Fix: #1490
    
    ### Does this PR introduce _any_ user-facing change?
    
    No.
    
    ### How was this patch tested?
    
    Existing tests.
---
 .../spark3/src/main/java/org/apache/spark/shuffle/RssShuffleManager.java | 1 +
 1 file changed, 1 insertion(+)

diff --git a/client-spark/spark3/src/main/java/org/apache/spark/shuffle/RssShuffleManager.java b/client-spark/spark3/src/main/java/org/apache/spark/shuffle/RssShuffleManager.java
index 27c8bb8ba..625089118 100644
--- a/client-spark/spark3/src/main/java/org/apache/spark/shuffle/RssShuffleManager.java
+++ b/client-spark/spark3/src/main/java/org/apache/spark/shuffle/RssShuffleManager.java
@@ -221,6 +221,7 @@ public class RssShuffleManager extends RssShuffleManagerBase {
     RssSparkShuffleUtils.validateRssClientConf(sparkConf);
     // External shuffle service is not supported when using remote shuffle service
     sparkConf.set("spark.shuffle.service.enabled", "false");
+    sparkConf.set("spark.dynamicAllocation.shuffleTracking.enabled", "false");
     LOG.info("Disable external shuffle service in RssShuffleManager.");
     sparkConf.set("spark.sql.adaptive.localShuffleReader.enabled", "false");
     LOG.info("Disable local shuffle reader in RssShuffleManager.");
