Github user squito commented on a diff in the pull request:
https://github.com/apache/spark/pull/6291#discussion_r36262816
--- Diff:
core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala ---
@@ -749,6 +752,43 @@ class DAGSchedulerSuite
assertDataStructuresEmpty()
}
+ /**
+ * Makes sure that tasks for a stage used by multiple jobs are submitted with the properties of a
+ * later, active job if they were previously run under a job that is no longer active.
+ */
+ test("stage used by two jobs, the first no longer active") {
+ val baseRdd = new MyRDD(sc, 1, Nil)
+ val finalRdd1 = new MyRDD(sc, 1, List(new OneToOneDependency(baseRdd)))
+ val finalRdd2 = new MyRDD(sc, 1, List(new OneToOneDependency(baseRdd)))
--- End diff --
is the problem that with a OneToOneDependency there isn't actually a shared
stage? I think there will be a shared RDD, but the stages are still separate.
I'm thinking you need a test with a shared ShuffleDependency.
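For reference, a sketch of what a shared-stage variant might look like inside DAGSchedulerSuite, reusing the suite's existing `MyRDD`/`sc` helpers. This is only an illustration, not the proposed fix; the partition counts and the `HashPartitioner` argument here are assumptions, not part of the original diff:

```scala
// Sketch only: both final RDDs reference the *same* ShuffleDependency,
// so they share the upstream shuffle map stage. With OneToOneDependency,
// each final RDD gets its own (narrow) stage even over a shared base RDD.
val shuffleMapRdd = new MyRDD(sc, 2, Nil)
val shuffleDep = new ShuffleDependency(shuffleMapRdd, new HashPartitioner(2))
val finalRdd1 = new MyRDD(sc, 1, List(shuffleDep))
val finalRdd2 = new MyRDD(sc, 1, List(shuffleDep))
```

Submitting `finalRdd1` under the first job, letting that job become inactive, and then submitting `finalRdd2` would then exercise the resubmission path through a stage that is genuinely shared between the two jobs.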