GitHub user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/21915#discussion_r206616882
--- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ---
@@ -367,6 +368,25 @@ class DAGScheduler(
stage
}
+ /**
+ * We don't support run a barrier stage with dynamic resource allocation enabled, it shall lead
+ * to some confusing behaviors (eg. with dynamic resource allocation enabled, it may happen that
+ * we acquire some executors (but not enough to launch all the tasks in a barrier stage) and
+ * later release them due to executor idle time expire, and then acquire again).
+ *
+ * We perform the check on job submit and fail fast if running a barrier stage with dynamic
+ * resource allocation enabled.
+ *
+ * TODO SPARK-24942 Improve cluster resource management with jobs containing barrier stage
+ */
+ private def checkBarrierStageWithDynamicAllocation(rdd: RDD[_]): Unit = {
+ if (rdd.isBarrier() && Utils.isDynamicAllocationEnabled(sc.getConf)) {
+ throw new SparkException("Don't support run a barrier stage with dynamic resource " +
--- End diff --
* Suggested error message: `[SPARK-24942]: Barrier execution mode does not support dynamic resource allocation for now. You can disable dynamic resource allocation by setting Spark conf "spark.dynamicAllocation.enabled" to "false".`
* Make the error message a constant to simplify testing.
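
One way the two points above could fit together is sketched below. This is only an illustration, not the code from the PR: it is deliberately Spark-free so it compiles on its own, the names `BarrierChecks`, `BarrierCheckException`, and `ERROR_MESSAGE_RUN_BARRIER_WITH_DYN_ALLOCATION` are hypothetical, and the two boolean parameters stand in for `rdd.isBarrier()` and `Utils.isDynamicAllocationEnabled(sc.getConf)` from the diff.

```scala
// Hypothetical, Spark-free sketch. All names here are illustrative assumptions.
object BarrierChecks {
  // The message text is copied from the review suggestion. Holding it in one
  // constant lets the check and its tests refer to the same string.
  val ERROR_MESSAGE_RUN_BARRIER_WITH_DYN_ALLOCATION: String =
    "[SPARK-24942]: Barrier execution mode does not support dynamic resource " +
      "allocation for now. You can disable dynamic resource allocation by setting " +
      "Spark conf \"spark.dynamicAllocation.enabled\" to \"false\"."

  // Stand-in for SparkException so the sketch needs no Spark dependency.
  final class BarrierCheckException(message: String) extends Exception(message)

  // Fail fast when a barrier stage is combined with dynamic allocation.
  // The booleans replace rdd.isBarrier() and
  // Utils.isDynamicAllocationEnabled(sc.getConf) from the diff.
  def checkBarrierStageWithDynamicAllocation(
      isBarrier: Boolean,
      dynamicAllocationEnabled: Boolean): Unit = {
    if (isBarrier && dynamicAllocationEnabled) {
      throw new BarrierCheckException(ERROR_MESSAGE_RUN_BARRIER_WITH_DYN_ALLOCATION)
    }
  }
}

object BarrierChecksDemo {
  def main(args: Array[String]): Unit = {
    import BarrierChecks._
    try {
      checkBarrierStageWithDynamicAllocation(isBarrier = true, dynamicAllocationEnabled = true)
    } catch {
      case e: BarrierCheckException =>
        // A test can compare against the constant instead of a copied string.
        assert(e.getMessage == ERROR_MESSAGE_RUN_BARRIER_WITH_DYN_ALLOCATION)
    }
  }
}
```

With the message in a constant, a test only has to assert equality against that constant, so rewording the message later does not require touching the test.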