GitHub user squito commented on the issue:
https://github.com/apache/spark/pull/17854
It looks to me like this is actually making 2 behavior changes:
1) throttle the requests for new containers, as you describe in your description
2) drop newly received containers if they are over the limit (not in the description).

Is that correct? Did you find that (2) was necessary as well?
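To make sure I'm reading it right, here's a minimal sketch of the two changes as I understand them -- all names and numbers (`maxPendingRequests`, `handleAllocated`, the cap of 500, etc.) are illustrative assumptions, not the actual `YarnAllocator` API:

```scala
// Illustrative sketch only -- names, fields, and limits are assumptions,
// not the actual YarnAllocator code.
object AllocatorSketch {
  // (1) hypothetical cap on outstanding container requests per cycle
  val maxPendingRequests = 500

  var targetNumExecutors = 2500
  var numExecutorsRunning = 0
  var numPendingRequests = 0

  // (1) Throttle: ask YARN for at most enough containers to stay under
  // the cap, rather than the full remaining target in one shot.
  def containersToRequest(): Int = {
    val missing = targetNumExecutors - numExecutorsRunning - numPendingRequests
    math.max(0, math.min(missing, maxPendingRequests - numPendingRequests))
  }

  // (2) Drop: if YARN hands back more containers than we still need,
  // release the excess instead of launching executors on them.
  def handleAllocated(containers: Seq[String]): Unit = {
    numPendingRequests = math.max(0, numPendingRequests - containers.size)
    val needed = math.max(0, targetNumExecutors - numExecutorsRunning)
    val (toUse, excess) = containers.splitAt(needed)
    toUse.foreach { c => numExecutorsRunning += 1; launchExecutor(c) }
    excess.foreach(releaseContainer)
  }

  private def launchExecutor(containerId: String): Unit =
    println(s"launching executor on $containerId")

  private def releaseContainer(containerId: String): Unit =
    println(s"releasing excess container $containerId")
}
```

It's that `releaseContainer` path in (2) that I want to be sure about, since it isn't mentioned in the description.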
I understand the problem you are describing, but I'm surprised this really
helps the driver scale up to more executors. Maybe this will let the executors
start, but won't it just lead to the driver getting swamped when you've got
2500 executors sending heartbeats and task updates? I'm not saying it's bad to
make this improvement, just trying to understand. I'd feel better about just
doing (1) -- if you found (2) is necessary, I would want to think through the
implications a bit more.