For example, today I tried this command:

./bin/yarn-session.sh -nm test_flink -q -qu upd_security -s 1 -tm 3024MB -jm 3024MB

I had also set export HADOOP_USER_NAME=xxx, and the log at startup shows it being picked up:

org.apache.flink.runtime.security.modules.HadoopModule  - Hadoop user set to upd_security (auth:SIMPLE)

It then failed with:

2020-08-24 10:52:31 ERROR org.apache.flink.yarn.cli.FlinkYarnSessionCli  - Error while running the Flink session.
java.lang.RuntimeException: Couldn't get cluster description
        at org.apache.flink.yarn.YarnClusterDescriptor.getClusterDescription(YarnClusterDescriptor.java:1254)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:534)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli.lambda$main$5(FlinkYarnSessionCli.java:785)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1754)
        at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:785)
Caused by: java.lang.NullPointerException: null
        at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getChildQueues(YarnClientImpl.java:587)
        at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getAllQueues(YarnClientImpl.java:557)
        at org.apache.flink.yarn.YarnClusterDescriptor.getClusterDescription(YarnClusterDescriptor.java:1247)
        ... 7 common frames omitted

------------------------------------------------------------
 The program finished with the following exception:

java.lang.RuntimeException: Couldn't get cluster description
        at org.apache.flink.yarn.YarnClusterDescriptor.getClusterDescription(YarnClusterDescriptor.java:1254)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:534)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli.lambda$main$5(FlinkYarnSessionCli.java:785)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1754)
        at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:785)
Caused by: java.lang.NullPointerException
        at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getChildQueues(YarnClientImpl.java:587)
        at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getAllQueues(YarnClientImpl.java:557)
        at org.apache.flink.yarn.YarnClusterDescriptor.getClusterDescription(YarnClusterDescriptor.java:1247)
        ... 7 more
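
Since the NullPointerException is thrown in YarnClientImpl.getAllQueues, one thing that may help narrow it down is asking YARN about the queue directly, outside Flink and as the same user. A minimal sketch, assuming the yarn command from the same Hadoop client installation is on the PATH. queue_status and YARN_BIN are hypothetical names used only for this illustration; they are not Hadoop or Flink settings.

```shell
# Sketch: inspect a YARN queue without going through Flink.
# YARN_BIN is a hypothetical override for the yarn executable
# (handy for dry runs); it defaults to `yarn` on the PATH.
YARN_BIN="${YARN_BIN:-yarn}"

queue_status() {
    # $1: queue name, e.g. the value given to yarn-session.sh via -qu
    if [ -z "$1" ]; then
        echo "usage: queue_status <queue-name>" >&2
        return 2
    fi
    # `yarn queue -status <name>` prints the queue's state and capacity figures.
    "$YARN_BIN" queue -status "$1"
}

# Example, run as the same user the session is submitted as:
#   export HADOOP_USER_NAME=xxx
#   queue_status upd_security
```

If this call also fails, or reports that the queue cannot be found, that would point at the queue or user setup on the YARN side rather than at Flink.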





On Mon, Aug 24, 2020 at 10:00 AM, caozhen <caozhen1...@163.com> wrote:

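> The error means there were not enough vcores when the AM requested resources.
>
> 1. Check whether the current queue has enough vcores.
> 2. Check the maximum number of applications the current queue allows.
>
> When I ran into this problem before, it was because the queue's resources had not been configured properly.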
>
>
>
> --
> Sent from: http://apache-flink.147419.n8.nabble.com/
