Re: [PR] YARN-7327: Enable asynchronous scheduling by default for capacity scheduler [hadoop]

via GitHub Tue, 10 Dec 2024 04:32:16 -0800


shameersss1 commented on PR #7138:
URL: https://github.com/apache/hadoop/pull/7138#issuecomment-2531512162

@brumi1024 - Thanks for looking into this.

> what is the reason behind changing the default of this setting?

1. The current default scheduling mechanism is synchronous (node-heartbeat
driven), which is inefficient when a large number of containers need to be
allocated.
2. It also has other issues: for example, scheduling does not happen when
node heartbeats are lost because of a network issue.
3. @wangdatan did an amazing job of making async scheduling production
ready. Refer to
https://issues.apache.org/jira/browse/YARN-7327?focusedCommentId=16205259&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16205259
for benchmark details.
4. That benchmark shows that async scheduling throughput is better than sync
scheduling throughput.

Hence the proposal here is to change the default scheduling strategy
for the capacity scheduler from synchronous to asynchronous. Companies such as
Alibaba Cloud already use this in production:
https://www.alibabacloud.com/help/en/emr/emr-on-ecs/user-guide/yarn-schedulers
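
For context, asynchronous scheduling can already be enabled explicitly through
configuration. Below is a minimal sketch of the relevant `capacity-scheduler.xml`
entries, assuming the property names used by current Capacity Scheduler
releases. The thread count and interval values are illustrative assumptions,
not recommendations and not values proposed in this PR.

```xml
<!-- capacity-scheduler.xml: opt in to asynchronous scheduling explicitly -->
<property>
  <name>yarn.scheduler.capacity.schedule-asynchronously.enable</name>
  <value>true</value>
</property>

<!-- Optional tuning; the values below are illustrative assumptions -->
<property>
  <name>yarn.scheduler.capacity.schedule-asynchronously.maximum-threads</name>
  <value>1</value>
</property>
<property>
  <name>yarn.scheduler.capacity.schedule-asynchronously.scheduling-interval-ms</name>
  <value>5</value>
</property>
```

Changing the default would amount to treating the first property as `true`
when it is not set, so clusters that need the heartbeat-driven behaviour would
set it to `false` explicitly.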

@brumi1024 - Do you see any blocker or issue with enabling it by
default?
