Github user tgravescs commented on a diff in the pull request:
https://github.com/apache/spark/pull/19194#discussion_r140332294
--- Diff:
core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala ---
@@ -619,6 +625,47 @@ private[spark] class ExecutorAllocationManager(
// place the executors.
private val stageIdToExecutorPlacementHints = new mutable.HashMap[Int, (Int, Map[String, Int])]
+ override def onJobStart(jobStart: SparkListenerJobStart): Unit = {
+ jobStart.stageInfos.foreach(stageInfo => stageIdToJobId(stageInfo.stageId) = jobStart.jobId)
+
+ var jobGroupId = if (jobStart.properties != null) {
+ jobStart.properties.getProperty(SparkContext.SPARK_JOB_GROUP_ID)
+ } else {
+ null
+ }
+
+ val maxConTasks = if (jobGroupId != null &&
+ conf.contains(s"spark.job.$jobGroupId.maxConcurrentTasks")) {
+ conf.get(s"spark.job.$jobGroupId.maxConcurrentTasks").toInt
+ } else {
+ Int.MaxValue
+ }
+
+ if (maxConTasks <= 0) {
+ throw new IllegalArgumentException(
+ "Maximum Concurrent Tasks should be set greater than 0 for the job to progress.")
+ }
+
+ if (jobGroupId == null || !conf.contains(s"spark.job.$jobGroupId.maxConcurrentTasks")) {
+ jobGroupId = DEFAULT_JOB_GROUP
+ }
+
+ jobIdToJobGroup(jobStart.jobId) = jobGroupId
+ if (!jobGroupToMaxConTasks.contains(jobGroupId)) {
--- End diff --
If we are talking about jobs within the same job group, it seems like the
value you get would be very timing dependent if we start allowing it to be
changed in real time. Let's say you have 1 thread and set the job group. If
all the jobs within that group are launched serially, then everything is easy
and allowing it to be changed can make sense. But if from that thread you spawn
other threads to launch jobs in parallel (which would still be in that same job
group) and each of those sets it differently, how do you know you will get the
right number for each of those jobs? The 2 threads could race to set the conf,
and if both set it right before launching, both launches are going to get one
of the settings, whereas one might have expected a different setting for each.
@squito does this cover the scenario you are referring to?
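
To make the race concrete, here is a minimal sketch, not code from this PR. It
assumes a shared mutable map as a stand-in for the conf and a hypothetical job
group named `myGroup`; the latch only forces the "both set right before
launching" interleaving described above so that the outcome is reproducible.

```scala
import java.util.concurrent.{ConcurrentHashMap, CountDownLatch}

object JobGroupConfRace {
  // Hypothetical stand-in for the shared conf: one mutable map seen by all threads.
  val conf = new ConcurrentHashMap[String, String]()
  // Key pattern taken from the diff; "myGroup" is an assumed job group id.
  val key = "spark.job.myGroup.maxConcurrentTasks"

  // Two launcher threads each set the limit they want, then read it back the
  // way onJobStart would when their job starts. The latch makes both writes
  // happen before either read.
  def run(): Map[String, Int] = {
    val bothSet = new CountDownLatch(2)
    val observed = new ConcurrentHashMap[String, Int]()

    def launcher(name: String, wanted: Int): Thread = new Thread(() => {
      conf.put(key, wanted.toString)          // "set the conf"
      bothSet.countDown()
      bothSet.await()                         // wait until the other thread has set it too
      observed.put(name, conf.get(key).toInt) // "launch": read the limit at job start
    })

    val threads = Seq(launcher("A", 10), launcher("B", 50))
    threads.foreach(_.start())
    threads.foreach(_.join())
    Map("A" -> observed.get("A"), "B" -> observed.get("B"))
  }
}
```

Both launches observe the same value (whichever write landed last), even though
thread A asked for 10 and thread B asked for 50.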
While both of those cases might be rare, I would lean towards making it more
predictable and only setting it once, rather than having users get something
they don't expect. But either could probably be documented away if we see the
serial scenario being more beneficial.
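
A rough sketch of the "only set it once" behaviour, assuming a per-group map
like the `jobGroupToMaxConTasks` in the diff. The `SetOnceLimits` object and
`limitFor` helper are hypothetical names for illustration, not part of the PR.

```scala
import scala.collection.mutable

object SetOnceLimits {
  // Assumed to mirror jobGroupToMaxConTasks from the diff.
  private val jobGroupToMaxConTasks = mutable.HashMap.empty[String, Int]

  // Records the limit the first time a job group is seen and returns that same
  // value for every later job in the group, ignoring later requested values.
  def limitFor(jobGroupId: String, requested: Int): Int = synchronized {
    require(requested > 0, "Maximum Concurrent Tasks should be set greater than 0")
    jobGroupToMaxConTasks.getOrElseUpdate(jobGroupId, requested)
  }
}
```

With this shape, a second job in the same group that asks for a different
limit still gets the first one, which is predictable but rules out the serial
"change it between jobs" use case.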
Ideally it would be nice to set this at the stage level, but that is a lot
more difficult.