Repository: spark
Updated Branches:
  refs/heads/branch-1.5 1038f677b -> 829c33a4b


[SPARK-10087] [CORE] [BRANCH-1.5] Disable spark.shuffle.reduceLocality.enabled by default.

https://issues.apache.org/jira/browse/SPARK-10087

In some cases, when spark.shuffle.reduceLocality.enabled is enabled, we end up
scheduling all reducers to the same executor even though the cluster has plenty
of resources. Setting spark.shuffle.reduceLocality.enabled to false resolves
the problem.

The comments on https://github.com/apache/spark/pull/8280 describe the
symptoms of this issue in more detail.

This PR changes the default setting of `spark.shuffle.reduceLocality.enabled` 
to `false` for branch 1.5.
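For jobs that still benefit from reduce-task locality, the flag can be re-enabled per application even with this new default. A minimal sketch using spark-submit; the jar and class names are placeholders, not part of this change:

```shell
# Re-enable reduce-task locality for one job on branch-1.5,
# where the default is now false.
# com.example.MyApp and myapp.jar are hypothetical placeholders.
spark-submit \
  --conf spark.shuffle.reduceLocality.enabled=true \
  --class com.example.MyApp \
  myapp.jar
```

The same setting can also be supplied programmatically via `SparkConf.set("spark.shuffle.reduceLocality.enabled", "true")` before the SparkContext is created, since DAGScheduler reads it with `sc.getConf.getBoolean`.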

Author: Yin Huai <[email protected]>

Closes #8296 from yhuai/setNumPartitionsCorrectly-branch1.5.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/829c33a4
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/829c33a4
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/829c33a4

Branch: refs/heads/branch-1.5
Commit: 829c33a4b4525b52f65a4cd69c7c86076506d35e
Parents: 1038f67
Author: Yin Huai <[email protected]>
Authored: Wed Aug 19 13:43:46 2015 -0700
Committer: Reynold Xin <[email protected]>
Committed: Wed Aug 19 13:43:46 2015 -0700

----------------------------------------------------------------------
 .../src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala | 2 +-
 .../scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala     | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/829c33a4/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala b/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
index 684db66..591b714 100644
--- a/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
+++ b/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
@@ -138,7 +138,7 @@ class DAGScheduler(
 
   // Flag to control if reduce tasks are assigned preferred locations
   private val shuffleLocalityEnabled =
-    sc.getConf.getBoolean("spark.shuffle.reduceLocality.enabled", true)
+    sc.getConf.getBoolean("spark.shuffle.reduceLocality.enabled", false)
  // Number of map, reduce tasks above which we do not assign preferred locations
  // based on map output sizes. We limit the size of jobs for which assign preferred locations
  // as computing the top locations by size becomes expensive.

http://git-wip-us.apache.org/repos/asf/spark/blob/829c33a4/core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala
----------------------------------------------------------------------
diff --git a/core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala b/core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala
index 2e8688c..9c94751 100644
--- a/core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala
+++ b/core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala
@@ -909,7 +909,7 @@ class DAGSchedulerSuite
     assertDataStructuresEmpty()
   }
 
-  test("reduce tasks should be placed locally with map output") {
+  ignore("reduce tasks should be placed locally with map output") {
     // Create an shuffleMapRdd with 1 partition
     val shuffleMapRdd = new MyRDD(sc, 1, Nil)
     val shuffleDep = new ShuffleDependency(shuffleMapRdd, null)
@@ -929,7 +929,7 @@ class DAGSchedulerSuite
     assertDataStructuresEmpty()
   }
 
-  test("reduce task locality preferences should only include machines with largest map outputs") {
+  ignore("reduce task locality preferences should only include machines with largest map outputs") {
     val numMapTasks = 4
     // Create an shuffleMapRdd with more partitions
     val shuffleMapRdd = new MyRDD(sc, numMapTasks, Nil)

