[ https://issues.apache.org/jira/browse/FLINK-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gary Yao updated FLINK-5542:
----------------------------
Fix Version/s: 1.5.5, 1.6.2
> YARN client incorrectly uses local YARN config to check vcore capacity
> ----------------------------------------------------------------------
>
> Key: FLINK-5542
> URL: https://issues.apache.org/jira/browse/FLINK-5542
> Project: Flink
> Issue Type: Bug
> Components: YARN
> Affects Versions: 1.1.4, 1.5.3, 1.6.0, 1.7.0
> Reporter: Shannon Carey
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.7.0, 1.6.2, 1.5.5
>
>
> See
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/1-1-4-on-YARN-vcores-change-td11016.html
> When using bin/yarn-session.sh, AbstractYarnClusterDescriptor (line 271 in
> 1.1.4) compares the user's selected number of vcores with the vcores
> configured in the local node's YARN config (read through YarnConfiguration,
> e.g. from yarn-site.xml and yarn-default.xml). This check incorrectly
> prevents Flink from launching even if there is sufficient vcore capacity on
> the cluster.
> The comparison is wrong because the application will not necessarily run on
> the local node. For example, when the yarn-session.sh client is run from the
> AWS EMR master node, the vcore count there may differ from the vcore count on
> the core nodes where Flink will actually run.
> A reasonable fix would probably be to reuse the logic from
> "yarn-session.sh -q" (FlinkYarnSessionCli line 550), which already knows how
> to get vcore information from the real worker nodes. Alternatively, perhaps
> we could remove the check entirely and rely on YARN's scheduler to determine
> whether sufficient resources exist.
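
The difference between the two checks can be sketched in a few lines of Java.
This is only an illustration under stated assumptions, not Flink's actual code:
the class VcoreCheck, the nested NodeCapacity type, and both method names are
hypothetical, and the worker list is hard-coded. In a real client the per-node
values might instead come from something like YarnClient#getNodeReports, whose
reports expose each node's capability, including its vcore count.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; none of these names exist in Flink or Hadoop. */
class VcoreCheck {

    /** Stand-in for a per-node capacity report from the cluster. */
    static final class NodeCapacity {
        final String host;
        final int vcores;

        NodeCapacity(String host, int vcores) {
            this.host = host;
            this.vcores = vcores;
        }
    }

    /** The check described in the report: uses the client machine's own config value. */
    static boolean fitsLocalConfig(int requestedVcores, int localConfiguredVcores) {
        return requestedVcores <= localConfiguredVcores;
    }

    /** Alternative: compares against the largest capacity reported by any worker node. */
    static boolean fitsOnSomeWorker(int requestedVcores, List<NodeCapacity> workers) {
        int maxVcores = 0;
        for (NodeCapacity worker : workers) {
            maxVcores = Math.max(maxVcores, worker.vcores);
        }
        return requestedVcores <= maxVcores;
    }

    public static void main(String[] args) {
        // Assumed setup: the client (master) node is configured with 4 vcores,
        // while each core node that would host containers offers 8.
        int localConfiguredVcores = 4;
        List<NodeCapacity> workers = Arrays.asList(
                new NodeCapacity("core-1", 8),
                new NodeCapacity("core-2", 8));
        int requestedVcores = 8;

        System.out.println("local config check: "
                + fitsLocalConfig(requestedVcores, localConfiguredVcores));
        System.out.println("worker report check: "
                + fitsOnSomeWorker(requestedVcores, workers));
    }
}
```

With these assumed numbers the local-config check rejects a request that the
worker nodes could satisfy, which is the behavior the report describes.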