Github user sryza commented on the pull request:

    https://github.com/apache/spark/pull/2746#issuecomment-59853416
  
    Everything sounds good except for a couple of specific callouts below:
    
    > we shouldn't wait for the new ones to register before asking for more.
    This still worries me.  My concern is that Spark will try to grab far 
more executors than it needs if YARN suddenly makes resources available to it 
in the middle of a long stage.  A user can certainly avoid this by setting a 
max, but that places an additional configuration burden on them, one that I 
think is both avoidable and difficult for inexperienced users to reason about.  
Sorry to keep being so picky about this - my main concern is that similar 
frameworks like MR and Tez essentially provide full resource elasticity with 
zero configuration attention required on the part of the user.  Their 
elasticity of course comes at the price of incurring more JVM startups, but I 
think the additional complexity is worthwhile if it can get Spark to parity.
    
    > To simplify the timer logic, we will make the variables hold the 
expiration times of the timers instead of counters that are reset to 0 every 
time the timers trigger.
    What's the reasoning behind using Timers vs. a polling approach?  I think 
the latter is still a fair bit easier to understand, all else being equal.
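    To make the comparison concrete, here is a minimal sketch of the polling 
alternative being suggested - this is a hypothetical illustration, not the 
actual Spark implementation, and the names (`ExpirationPolling`, 
`onPendingTasks`, `poll`, `addIntervalMs`) are invented for the example. 
Instead of scheduling a `Timer` per event, each event just records an 
expiration timestamp, and one periodic poll compares the clock against it:

    ```scala
    // Hypothetical sketch: expiration-timestamp variables checked by a
    // single polling loop, rather than per-event java.util.Timer tasks.
    object ExpirationPolling {
      val addIntervalMs = 100L          // illustrative scheduling interval
      var addTime: Long = Long.MaxValue // "not set" until tasks are pending

      // Called when pending tasks appear: arm the "add executors" deadline.
      def onPendingTasks(nowMs: Long): Unit =
        if (addTime == Long.MaxValue) addTime = nowMs + addIntervalMs

      // Called from a periodic poll; true means "request more executors now",
      // and the deadline is re-armed for the next interval.
      def poll(nowMs: Long): Boolean =
        if (nowMs >= addTime) { addTime = nowMs + addIntervalMs; true }
        else false
    }

    object Demo extends App {
      ExpirationPolling.onPendingTasks(0L)
      println(ExpirationPolling.poll(50L))  // deadline not reached yet
      println(ExpirationPolling.poll(100L)) // expired: fires and re-arms
    }
    ```

    The appeal is that all the "what happens next" state lives in a few 
plain timestamp fields inspected in one place, instead of being spread across 
timer callbacks that reset counters.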

