GitHub user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18950#discussion_r133499828
  
    --- Diff: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala ---
    @@ -602,6 +604,21 @@ private[spark] class ExecutorAllocationManager(
         // place the executors.
         private val stageIdToExecutorPlacementHints = new mutable.HashMap[Int, (Int, Map[String, Int])]
     
    +    override def onJobStart(jobStart: SparkListenerJobStart): Unit = {
    +      val jobGroupId = if (jobStart.properties != null) {
    +        jobStart.properties.getProperty(SparkContext.SPARK_JOB_GROUP_ID)
    +      } else {
    +        ""
    +      }
    +      val maxConcurrentTasks = conf.getInt(s"spark.job.$jobGroupId.maxConcurrentTasks",
    +        Int.MaxValue)
    +
    +      logInfo(s"Setting maximum concurrent tasks for group: ${jobGroupId} to $maxConcurrentTasks")
    +      allocationManager.synchronized {
    +        allocationManager.maxConcurrentTasks = maxConcurrentTasks
    --- End diff --
    
    Yeah, Mark is right. After all, that is what separates a job group property from a global property for the entire Spark context.
    
    I see why this is desirable for the most common case of running just one job at a time, but to get this to work with multiple concurrent jobs (and job groups) you need to track a map from jobGroup -> maxConcurrency, and then sum that up (handling overflow for Int.MaxValue).
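
    A minimal sketch of that bookkeeping, offered only as an illustration. The class name JobGroupConcurrencyTracker and its methods are hypothetical and not part of Spark or this PR. It assumes limits are non-negative, that Int.MaxValue means "no limit", and that having no registered groups also means "no limit". The sum is taken as a Long and then clamped, so several Int.MaxValue entries cannot wrap around.

```scala
import scala.collection.mutable

// Hypothetical helper, not part of Spark: tracks a limit per job group and
// exposes the saturating sum across all currently registered groups.
class JobGroupConcurrencyTracker {
  // jobGroup -> maxConcurrency, as suggested in the review comment.
  private val groupToMaxConcurrency = new mutable.HashMap[String, Int]

  // Register or update the limit for a job group (assumed non-negative).
  def setLimit(jobGroup: String, maxConcurrentTasks: Int): Unit = synchronized {
    groupToMaxConcurrency(jobGroup) = maxConcurrentTasks
  }

  // Drop a group, e.g. once its last job has finished.
  def removeGroup(jobGroup: String): Unit = synchronized {
    groupToMaxConcurrency.remove(jobGroup)
  }

  // Accumulate in Long so the addition cannot overflow Int,
  // then clamp the result back to Int.MaxValue.
  def totalMaxConcurrentTasks: Int = synchronized {
    if (groupToMaxConcurrency.isEmpty) {
      Int.MaxValue // assumption: no registered groups means no limit
    } else {
      val total = groupToMaxConcurrency.values.foldLeft(0L)(_ + _)
      math.min(total, Int.MaxValue.toLong).toInt
    }
  }
}
```

    In a listener, setLimit would presumably be called from onJobStart and removeGroup once the group has no running jobs left. When to remove a group is a design choice this sketch leaves open.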

