Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/1106#issuecomment-54674029
Yeah, I suppose so, but there was one corner case that I was concerned
about (which is addressed by treating it as a circular buffer):

Let's say we're trying to schedule three drivers but there are only two
workers that they can run on, and let's also say that there's initially
enough capacity to run all three drivers. In that case, I think we would pop
the head of `shuffleWorkers` once per driver until it becomes empty and never
schedule the third driver in this pass. That third driver would get scheduled
on a subsequent call to `Master.schedule()`, but I guess that only happens
when new apps join or when resource availability changes, so we might wait
longer than necessary to launch it (see the sketch below).
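
Here's a minimal, self-contained sketch of the pop-until-empty behavior I'm
describing; all of the names below (`Worker`, `canLaunch`, `launchDriver`,
etc.) are illustrative stand-ins rather than the actual code in this PR:

```scala
case class Worker(name: String, var freeCores: Int)

def canLaunch(w: Worker): Boolean = w.freeCores > 0
def launchDriver(w: Worker): Unit = w.freeCores -= 1

// Two workers, three drivers; total capacity is enough for all three.
val shuffleWorkers = scala.collection.mutable.Queue(Worker("a", 2), Worker("b", 1))
val drivers = Seq("driver-1", "driver-2", "driver-3")

for (d <- drivers if shuffleWorkers.nonEmpty) {
  val w = shuffleWorkers.dequeue()  // the worker is consumed even though it
  if (canLaunch(w)) launchDriver(w) // may have capacity for another driver
}
// After two iterations the queue is empty, so "driver-3" is never scheduled
// in this pass even though worker "a" still has a free core.
```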
The circular buffer brings its own problems, though: let's say that there
are _no_ valid locations where we can schedule the driver. In this case, we
need to stop looping through the buffer so that we don't spin in an infinite
loop; a bounded scan like the one sketched below would handle this.
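
And a hedged sketch of the circular-buffer variant with a termination guard,
using the same stand-in definitions as above: the loop visits each worker at
most once per driver, so it terminates even when no worker can launch the
driver:

```scala
case class Worker(name: String, var freeCores: Int)

def canLaunch(w: Worker): Boolean = w.freeCores > 0
def launchDriver(w: Worker): Unit = w.freeCores -= 1

val workers = Array(Worker("a", 2), Worker("b", 1))
val drivers = Seq("driver-1", "driver-2", "driver-3")

var pos = 0 // circular cursor into `workers`
for (d <- drivers) {
  var visited = 0
  var launched = false
  // Stop after one full pass: either the driver was launched, or no
  // worker can currently take it and we must not keep spinning.
  while (!launched && visited < workers.length) {
    val w = workers(pos)
    pos = (pos + 1) % workers.length
    visited += 1
    if (canLaunch(w)) {
      launchDriver(w)
      launched = true
    }
  }
}
// All three drivers get launched here, and a driver with no valid
// location simply falls through after one full scan of `workers`.
```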