Github user andrewor14 commented on the pull request:
https://github.com/apache/spark/pull/5536#issuecomment-94924265
Let's take a step back and summarize the problem and the proposed solution to make sure we're all on the same page. This issue is actually fairly complicated, so I think it's worth exploring in some detail. I'd be happy to hear your thoughts. @srowen @piaozhexiu @sryza
------------
**Symptom.** There are instances where `ExecutorAllocationManager` (EAM) requests a negative total number of executors from the `YarnAllocator`, and this results in an exception because the code correctly rejects such a request. I think we all agree on this part. Now let's trace through an example of how we can run into this exception.
**Problem illustrated with an example.**
(0) Let's say we start with 10 executors with nothing pending to be added or removed.
(1) Some tasks have finished such that the scheduler tells us we only need 5.
(2) Next time we call `schedule` we attempt to cancel 5 pending requests (but there are none).
(3) We set `numExecutorsPending = newTarget - executorIds.size + [...] = 5 - 10 + 0 = -5`.
(4) Next time we add executors our new target will be
```
// Let's say `numExecutorsToAdd == 1`
val currentTarget = numExecutorsPending + executorIds.size - [...] = -5 + 10 - 0 = 5
val newTarget = math.min(currentTarget + numExecutorsToAdd, [...]) = 5 + 1 = 6
```
(5) Thus, we only request 6 executors when we're trying to add, even though we already have 10.
Now imagine between steps (3) and (4), a number of executors fail (or are
removed, the distinction is not important). For instance, if all of the
existing executors fail, *then we will end up with a negative target* (-4 in
this case, computed using the same equation above), resulting in the exception.
This is consistent with the description in the
[JIRA](https://issues.apache.org/jira/browse/SPARK-6954) that this only happens
if we remove executors too quickly.
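To make the arithmetic above concrete, here is a minimal standalone sketch of the bookkeeping (not the actual EAM code; the object and variable names just mirror the fields above, and I'm ignoring the upper bound and pending-removal terms elided with `[...]`):
```
// Standalone sketch of the bookkeeping above, not the real EAM code.
object NegativeTargetExample {
  def main(args: Array[String]): Unit = {
    var numExecutorsPending = 0   // (0) nothing pending to be added or removed
    var numRegistered = 10        // (0) 10 registered executors
    val numExecutorsToAdd = 1

    // (1)-(3): the scheduler only needs 5, so we "cancel" down to that target
    val cancelTarget = 5
    numExecutorsPending = cancelTarget - numRegistered                // -5

    // (4)-(5): the next add computes a target of 6 even though we have 10
    println(numExecutorsPending + numRegistered + numExecutorsToAdd)  // 6

    // Now suppose all 10 executors fail between (3) and (4)
    numRegistered = 0
    println(numExecutorsPending + numRegistered + numExecutorsToAdd)  // -4
  }
}
```
The negative value in the last line is exactly the total that triggers the exception described above.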
**Underlying cause.** After a cancel, the EAM unconditionally adjusts the
new target downwards whether or not there are pending requests to be canceled,
and then computes `numExecutorsPending` using this new target. If there are no
pending executors, then the allocator actually *ignores* the cancel request
because there is nothing to cancel. Meanwhile, the EAM expects the new total
number of executors to go down as well, but this never happens.
In other words, when we cancel, we should only adjust the new target downwards by at most `numExecutorsPending`, since the allocator will only cancel pending requests, never kill existing executors.
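To spell out the mismatch between the two sides, here is a schematic sketch (the class and method names are illustrative, not the real APIs) of what happens on a cancel when nothing is pending:
```
// Illustrative only: class and method names do not correspond to the real APIs.
class AllocatorSide {
  var pendingRequests = 0  // nothing outstanding in our example

  def cancel(n: Int): Int = {
    // The allocator can only cancel what is actually pending; it never kills executors.
    val cancelled = math.min(n, pendingRequests)
    pendingRequests -= cancelled
    cancelled  // 0 here, so the cancel is effectively a no-op
  }
}

class EamSide(allocator: AllocatorSide) {
  var numExecutorsPending = 0
  val numRegistered = 10

  def cancelDownTo(newTarget: Int): Unit = {
    allocator.cancel(numRegistered - newTarget)      // cancels nothing
    // Buggy bookkeeping: assumes the total dropped to newTarget anyway.
    numExecutorsPending = newTarget - numRegistered  // -5
  }
}
```
The allocator's state is unchanged, but the EAM now believes it has -5 pending executors, which is the root of the negative targets above.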
**Proposed solution.** This patch ensures that `numExecutorsPending` never
goes below 0. This means the next time EAM tries to add executors, it will use
a target total that is at least the number of currently registered executors,
and this must be >= 0.
This behavior also matches the state on the allocator side in most cases.
When it receives a cancel request, the allocator will cancel at most the number
of pending requests that are outstanding (clearly this cannot be negative).
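For completeness, here is a rough sketch of the clamping idea (illustrative only; the actual change in the patch may be expressed differently inside `ExecutorAllocationManager`):
```
// Illustrative only: clamp so the pending count never goes below zero.
def pendingAfterCancel(newTarget: Int, numRegistered: Int): Int =
  math.max(0, newTarget - numRegistered)

// With the numbers from the example: max(0, 5 - 10) = 0. Even if all 10
// executors then fail, the next add targets 0 + 0 + 1 = 1, never a negative value.
```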
Note that even this solution is not perfect. There exists a race condition
in which a few of the pending requests we are trying to cancel may already be
granted by the time the cancel request reaches the allocator. In this case, the
EAM will get more executors than is needed, but that is fine because they will
eventually be removed anyway. It's worth noting that this whole cancel feature
is best-effort in the first place, so it's OK for the cancel to not always take
effect, but it's not OK for the cancel to cause the add to fail.