Repository: spark
Updated Branches:
  refs/heads/master 9bd80ad6b -> cafc696d0


[HOTFIX][CORE] fix flaky BasicSchedulerIntegrationTest

## What changes were proposed in this pull request?

SPARK-15927 exacerbated a race in BasicSchedulerIntegrationTest, so failures
went from very unlikely to fairly frequent. The issue is that stage numbering
is not completely deterministic, but these tests treated it as if it were. So
this patch removes the affected checks.
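
As a rough illustration of why a check keyed on a numeric stage ID is fragile, here is a minimal Scala sketch. It is a hypothetical model, not Spark's DAGScheduler: `assignIds`, `inputExpectedForId`, and the stage names are invented for illustration, and the only assumption is that IDs are handed out in the order stages happen to be visited.

```scala
object StageNumberingSketch {
  // Hypothetical: hand out consecutive stage IDs in visit order.
  def assignIds(visitOrder: Seq[String]): Map[String, Int] =
    visitOrder.zipWithIndex.toMap

  // Fragile lookup: hard-codes which shuffle output a given stage ID needs.
  def inputExpectedForId(stageId: Int): Option[String] = stageId match {
    case 1 => Some("b")
    case _ => None
  }

  // What each named stage really reads, independent of numbering.
  val actualInput: Map[String, String] =
    Map("consumesB" -> "b", "consumesC" -> "c")

  // True when the hard-coded ID check agrees with what each stage reads.
  def idCheckIsCorrect(visitOrder: Seq[String]): Boolean = {
    val ids = assignIds(visitOrder)
    actualInput.forall { case (stage, input) =>
      inputExpectedForId(ids(stage)).forall(_ == input)
    }
  }
}
```

With the visit order `Seq("root", "consumesB", "consumesC")` the hard-coded check happens to be right. With `Seq("root", "consumesC", "consumesB")` it asserts on the wrong shuffle output, which is the kind of mismatch that makes an ID-based assertion flaky.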

## How was this patch tested?

On my laptop the test failed about 10% of the time before this change, and
didn't fail in 500 runs after the change.

Author: Imran Rashid <[email protected]>

Closes #13688 from squito/hotfix_basic_scheduler.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/cafc696d
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/cafc696d
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/cafc696d

Branch: refs/heads/master
Commit: cafc696d095ae06dd64805574d55a19637743aa6
Parents: 9bd80ad
Author: Imran Rashid <[email protected]>
Authored: Wed Jun 15 16:44:18 2016 -0500
Committer: Imran Rashid <[email protected]>
Committed: Wed Jun 15 16:44:18 2016 -0500

----------------------------------------------------------------------
 .../spark/scheduler/SchedulerIntegrationSuite.scala  | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/cafc696d/core/src/test/scala/org/apache/spark/scheduler/SchedulerIntegrationSuite.scala
----------------------------------------------------------------------
diff --git a/core/src/test/scala/org/apache/spark/scheduler/SchedulerIntegrationSuite.scala b/core/src/test/scala/org/apache/spark/scheduler/SchedulerIntegrationSuite.scala
index 54b7312..12dfa56 100644
--- a/core/src/test/scala/org/apache/spark/scheduler/SchedulerIntegrationSuite.scala
+++ b/core/src/test/scala/org/apache/spark/scheduler/SchedulerIntegrationSuite.scala
@@ -518,10 +518,11 @@ class BasicSchedulerIntegrationSuite extends SchedulerIntegrationSuite[SingleCor
 
       // make sure the required map output is available
       task.stageId match {
-        case 1 => assertMapOutputAvailable(b)
-        case 3 => assertMapOutputAvailable(c)
         case 4 => assertMapOutputAvailable(d)
-        case _ => // no shuffle map input, nothing to check
+        case _ =>
+        // we can't check for the output for the two intermediate stages, unfortunately,
+        // b/c the stage numbering is non-deterministic, so stage number alone doesn't tell
+        // us what to check
       }
 
       (task.stageId, task.stageAttemptId, task.partitionId) match {
@@ -557,11 +558,9 @@ class BasicSchedulerIntegrationSuite extends SchedulerIntegrationSuite[SingleCor
       val (taskDescription, task) = backend.beginTask()
       stageToAttempts.getOrElseUpdate(task.stageId, new HashSet()) += task.stageAttemptId
 
-      // make sure the required map output is available
-      task.stageId match {
-        case 1 => assertMapOutputAvailable(shuffledRdd)
-        case _ => // no shuffle map input, nothing to check
-      }
+      // We cannot check if shuffle output is available, because the failed fetch will clear the
+      // shuffle output.  Then we'd have a race, between the already-started task from the first
+      // attempt, and when the failure clears out the map output status.
 
       (task.stageId, task.stageAttemptId, task.partitionId) match {
         case (0, _, _) =>

