Github user tgravescs commented on the pull request:

    https://github.com/apache/spark/pull/6082#issuecomment-101854622
  
    I agree it should be automated or have a reasonable default (5 seconds is 
definitely high), but I think there are so many different configs and possible 
setups that it is very hard to do based on the current YARN RM.  How fast you 
get containers is a factor of too many things: NM heartbeats, size of cluster, 
load on cluster, etc.
    We could just change the default to 1 second, like MR, and have folks like 
myself change our default to be higher for our clusters.
    I'm not necessarily against this; my question is how useful it is in its 
current form, or could it be better?  For instance, have we run any tests with 
this?  Does using the backoff differ at all from having 2 configs and setting 
one to 1 second and the other (for when I don't need containers) to 3 or 5 
seconds?  The difference there is that in 1400ms I heartbeat 3 times with the 
backoff, compared to 1 time with a 1 second interval.  How does that affect the 
overall job time?  In this case I'm assuming we are mostly concerned with small 
jobs, as a few seconds in large jobs shouldn't show up.  It would be nice to 
see some numbers on that.  If my job is delayed even a few seconds then this 
doesn't have any effect at all.  So if I run a simple test being second in the 
queue, where the first app is asking for containers, I assume this does 
nothing?  Could it be better to have the 2 configs, where it heartbeats faster 
when I need containers but still doesn't overwhelm the RM?
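
    To make the 1400ms comparison concrete, here is a rough sketch that counts 
heartbeats in a time window for a doubling backoff versus a fixed interval.  
The 200ms starting interval, the doubling rule, and the 3000ms cap are 
assumptions chosen so that the numbers line up with the "3 times in 1400ms" 
figure; the function names are made up for illustration and are not from the 
patch.

```python
def backoff_heartbeat_times(initial_ms, max_ms, window_ms):
    """Heartbeat times (ms from start) for a doubling backoff capped at max_ms.

    Assumption: the first heartbeat fires after initial_ms, and each later
    interval is double the previous one, never exceeding max_ms.
    """
    times = []
    t = 0
    interval = initial_ms
    while t + interval <= window_ms:
        t += interval
        times.append(t)
        interval = min(interval * 2, max_ms)
    return times


def fixed_heartbeat_times(interval_ms, window_ms):
    """Heartbeat times (ms from start) for a constant interval."""
    return list(range(interval_ms, window_ms + 1, interval_ms))


if __name__ == "__main__":
    window_ms = 1400
    # Hypothetical values: 200ms initial interval, 3000ms cap.
    backoff = backoff_heartbeat_times(200, 3000, window_ms)
    fixed = fixed_heartbeat_times(1000, window_ms)
    print("backoff:", backoff)  # [200, 600, 1400] -> 3 heartbeats
    print("fixed 1s:", fixed)   # [1000] -> 1 heartbeat
```

    This only counts requests; it says nothing about how quickly the RM 
actually hands out containers, which is what the job-time numbers would have to 
show.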
    
    If there is a difference, great, let's go with this; if not, is it really 
necessary, or is there a better option?  Then the question comes down to RM 
load.  If we do end up heartbeating every 1 second, does that hurt the RM?  
This again is going to be very cluster dependent.  I would guess on most small 
and medium clusters it's fine.  Folks with larger clusters can configure it up 
slightly.
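
    As a back-of-the-envelope way to think about the RM load question, the 
steady-state AM heartbeat rate is roughly the number of running applications 
divided by the heartbeat interval.  The sketch below is only that arithmetic; 
the application counts are hypothetical, and it ignores NM heartbeats and 
everything else the RM is doing.

```python
def rm_heartbeats_per_second(num_apps, interval_ms):
    """Approximate AM->RM heartbeats per second if every app uses interval_ms."""
    return num_apps * 1000.0 / interval_ms


if __name__ == "__main__":
    # Hypothetical cluster sizes, for illustration only.
    for num_apps in (10, 100, 1000):
        for interval_ms in (1000, 3000, 5000):
            rate = rm_heartbeats_per_second(num_apps, interval_ms)
            print(f"{num_apps} apps @ {interval_ms}ms -> {rate:.1f} heartbeats/s")
```

    Whether a given rate is a problem still depends on the cluster, which is 
why a configurable interval matters for larger setups.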


