Github user squito commented on the issue:
https://github.com/apache/spark/pull/19194
Sorry if I was unclear earlier on the issue w/ the active job ID. I agree
that if a user actually gets into this situation, where they've got two
different jobs for the same stage with different max concurrent tasks, it's
mostly a toss-up which one they'll get, as the users' jobs are probably racing
to get to that stage. Still, I think it's important that it pulls the max
concurrent tasks from the active job, so that users can understand what is
going on, and for consistency and debuggability. The TaskSetManager gets the
property from the active job, which actually submitted the stage, so the
ExecutorAllocationManager should do the same.
I think the best way to ensure that is to add activeJobId to
SparkListenerStageSubmitted. Then you'd go back to just keeping a
jobIdToMaxConcurrentTasks map when handling onJobStart, and in
onStageSubmitted you'd figure out the max number of tasks for that stage,
given the job which actually submitted it.
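For concreteness, here's a minimal sketch of the shape I have in mind. It
assumes SparkListenerStageSubmitted gains the proposed activeJobId field;
the listener class name, the property key, and the executor-sizing hook are
all illustrative, not the actual patch:

```scala
import scala.collection.mutable

import org.apache.spark.scheduler.{SparkListener, SparkListenerJobEnd,
  SparkListenerJobStart, SparkListenerStageSubmitted}

// Sketch only: `activeJobId` on SparkListenerStageSubmitted is the field
// proposed above; "spark.job.maxConcurrentTasks" is a placeholder key.
private[spark] class MaxConcurrentTasksListener extends SparkListener {

  // Populated in onJobStart from the submitting job's local properties.
  private val jobIdToMaxConcurrentTasks = new mutable.HashMap[Int, Int]

  override def onJobStart(jobStart: SparkListenerJobStart): Unit = {
    Option(jobStart.properties)
      .flatMap(p => Option(p.getProperty("spark.job.maxConcurrentTasks")))
      .foreach(v => jobIdToMaxConcurrentTasks(jobStart.jobId) = v.toInt)
  }

  override def onStageSubmitted(stageSubmitted: SparkListenerStageSubmitted): Unit = {
    // Hypothetical: resolve the limit via the job that actually submitted
    // the stage, matching what the TaskSetManager sees.
    val limit = jobIdToMaxConcurrentTasks.get(stageSubmitted.activeJobId)
    // ... feed `limit` into the executor-count calculation for this stage ...
  }

  override def onJobEnd(jobEnd: SparkListenerJobEnd): Unit = {
    // Drop the entry once the job finishes so the map doesn't grow unbounded.
    jobIdToMaxConcurrentTasks.remove(jobEnd.jobId)
  }
}
```

That keeps the bookkeeping per-job, and the stage-to-limit resolution happens
exactly once, at the point the stage is actually submitted.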
@tgravescs what do you think?