Github user tgravescs commented on the pull request:
https://github.com/apache/spark/pull/6082#issuecomment-101854622
I agree it should be automated or have a reasonable default (5 is definitely
high), but I think there are so many different configs and possible setups that
it is very hard to do based on the current YARN RM. How fast you get containers
is a factor of too many things: NM heartbeats, size of cluster, load on cluster,
etc.
We could just change the default to 1 second like MR and have folks like myself
change our default to be higher for our clusters.
I'm not necessarily against this; my question is how useful it is in the
current form, or could it be better? For instance, have we run any tests with
this? Does using the backoff differ at all from having 2 configs and setting one
to 1 second and the other (when I don't need containers) to 3 or 5 seconds? The
difference there is that in 1400ms I heartbeat 3 times, compared to 1 time in 1
second. How does that affect the overall job time? In this case I'm assuming
we are mostly concerned with small jobs, as a few seconds in large jobs
shouldn't show up. It would be nice to see some numbers on that. If my job is
delayed even a few seconds then this doesn't have any effect at all. So if I
run a simple test being second in the queue, where the first app is asking for
containers, I assume this does nothing? Could it be better to have the 2 configs,
where it heartbeats faster when I need containers but still doesn't overwhelm
the RM?
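
To make the comparison above concrete, here is a rough sketch of the two
schedules. It assumes the backoff starts at 200ms and doubles after each
heartbeat (which is what would give 3 heartbeats in 1400ms); the function
names, the 5000ms cap, and the 1000ms/3000ms values for the 2-config variant
are illustrative assumptions, not the PR's actual code.

```python
def heartbeat_times(initial_ms, max_ms, horizon_ms, factor=2):
    """Return the times (ms) at which heartbeats fire within horizon_ms.

    The wait before each heartbeat starts at initial_ms and is multiplied
    by factor after every heartbeat, capped at max_ms. With
    initial_ms == max_ms this degenerates to a fixed interval.
    """
    times = []
    now = 0
    interval = initial_ms
    while now + interval <= horizon_ms:
        now += interval
        times.append(now)
        interval = min(interval * factor, max_ms)
    return times


def two_config_interval(pending_containers, fast_ms=1000, slow_ms=3000):
    """Hypothetical 2-config policy: heartbeat fast only while containers
    are still outstanding, otherwise fall back to the slow interval."""
    return fast_ms if pending_containers > 0 else slow_ms


# Assumed backoff: 200ms, 400ms, 800ms, ... capped at 5000ms.
backoff = heartbeat_times(initial_ms=200, max_ms=5000, horizon_ms=1400)
# Fixed 1 second interval over the same 1400ms window.
fixed = heartbeat_times(initial_ms=1000, max_ms=1000, horizon_ms=1400)

print(backoff)  # [200, 600, 1400]
print(fixed)    # [1000]
```

This only shows when the AM would ask; whether the earlier asks actually
shorten job time still depends on when the RM has containers to hand out,
which is why real numbers would be useful.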
If there is a difference, great, let's go with this; if not, is it really
necessary, or is there a better option? Then the question comes down to RM
load. If we do end up heartbeating every 1 second, does that hurt the RM?
This again is going to be very cluster dependent. I would guess on most small
and medium clusters it's fine. Folks with larger clusters can configure it up
slightly.
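
As a back-of-the-envelope way to think about the RM load question, the
sketch below estimates the aggregate AM heartbeat rate for a given number of
running applications. The application counts are made-up placeholders, and it
ignores NM heartbeats and everything else the RM handles.

```python
def rm_heartbeats_per_second(num_apps, interval_ms):
    """Aggregate AM->RM heartbeat rate if every app uses the same interval."""
    return num_apps * 1000.0 / interval_ms


# Hypothetical cluster sizes, purely for illustration.
for num_apps in (50, 500, 5000):
    at_1s = rm_heartbeats_per_second(num_apps, 1000)
    at_5s = rm_heartbeats_per_second(num_apps, 5000)
    print(f"{num_apps} apps: {at_1s:.0f}/s at 1s vs {at_5s:.0f}/s at 5s")
```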