tgravescs commented on a change in pull request #28257:
URL: https://github.com/apache/spark/pull/28257#discussion_r411430660
##########
File path:
core/src/test/scala/org/apache/spark/scheduler/BarrierTaskContextSuite.scala
##########
@@ -276,4 +276,20 @@ class BarrierTaskContextSuite extends SparkFunSuite with LocalSparkContext {
     initLocalClusterSparkContext()
     testBarrierTaskKilled(interruptOnKill = true)
   }
+
+  test("SPARK-31485: barrier stage should fail if only partial tasks are launched") {
+    initLocalClusterSparkContext(2)
+    val rdd0 = sc.parallelize(Seq(0, 1, 2, 3), 2)
+    val dep = new OneToOneDependency[Int](rdd0)
+    // set up a barrier stage with 2 tasks and both tasks prefer executor 0 (only 1 core) for
Review comment:
OK, looking through some of the logic some more: I guess if the total number of slots is less than the number of tasks, it skips it higher up. It still seems very odd that you fail based on locality being set. Wouldn't you just want to ignore locality in this case? Dynamic allocation isn't on, so if the number of slots equals the number of tasks, you need to just schedule it.
Do we at least recommend that people turn off locality when using barrier mode? I recommend shutting it off to most people anyway on YARN because it can have other bad issues. Do we have a JIRA for this issue? It seems like it could be very confusing to users. The message says something about blacklisting, so if I were trying to debug why my job isn't being scheduled, I think this would be very hard to figure out.
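
As an aside, the "turn off locality" workaround mentioned above could look like the following config fragment. This is only an illustrative sketch, not something prescribed by this PR: it assumes the standard `spark.locality.wait` setting is the relevant knob (setting it to `0` disables delay scheduling, so tasks are placed on any free slot instead of waiting for a preferred location), and the application class and jar names are hypothetical.

```shell
# Illustrative sketch: disable delay scheduling for a barrier job so tasks
# are not held back waiting for their preferred executors.
# spark.locality.wait=0 means "schedule on any free slot immediately".
# com.example.BarrierApp / barrier-app.jar are hypothetical placeholders.
spark-submit \
  --master yarn \
  --conf spark.locality.wait=0 \
  --class com.example.BarrierApp \
  barrier-app.jar
```

Whether this is the right default for barrier stages (versus ignoring locality preferences automatically when the stage would otherwise fail) is exactly the question raised above.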
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]