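Hi,

"Queue's AM resource limit exceeded" -> This looks like YARN limiting the resources that ApplicationMasters (AMs) may use, with a cap of 4096 MB of memory. You are presumably launching in job mode, where every job starts its own AM and each AM takes 2048 MB of memory? If that is the case, there is indeed only enough room to start two.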
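
A rough sketch of that arithmetic in Python, in case it helps. The 4096 MB limit and the 2048 MB AM request are taken from the diagnostics quoted below. Everything else is an assumption on my side: that the cluster runs the Capacity Scheduler with a single queue holding all capacity, that each of the 5 NodeManagers offers the default 8192 MB (yarn.nodemanager.resource.memory-mb), and that yarn.scheduler.capacity.maximum-am-resource-percent is at its default of 0.1. The function names are made up for illustration, and the real scheduler also rounds to allocation increments, which this ignores.

```python
def am_limit_mb(node_count: int, node_memory_mb: int, max_am_percent: float) -> int:
    """Simplified AM memory limit for a queue that owns the whole cluster.

    Assumption: limit = total NodeManager memory * maximum-am-resource-percent.
    """
    return round(node_count * node_memory_mb * max_am_percent)


def max_concurrent_ams(limit_mb: int, am_request_mb: int) -> int:
    """How many AMs of the given size fit under the limit."""
    return limit_mb // am_request_mb


# Assumed defaults: 5 nodes x 8192 MB, maximum-am-resource-percent = 0.1
limit = am_limit_mb(node_count=5, node_memory_mb=8192, max_am_percent=0.1)
print(limit)                            # AM limit in MB under these assumptions
print(max_concurrent_ams(limit, 2048))  # AMs of 2048 MB that fit

# Same cluster with a hypothetical higher setting of 0.5
higher = am_limit_mb(node_count=5, node_memory_mb=8192, max_am_percent=0.5)
print(max_concurrent_ams(higher, 2048))
```

If those assumptions hold, the computed limit matches the "Queue Resource Limit for AM = <memory:4096>" in your diagnostics, so raising yarn.scheduler.capacity.maximum-am-resource-percent in capacity-scheduler.xml, or lowering jobmanager.heap.size, should let more jobs start.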

1900 <[email protected]> wrote on Thu, Apr 4, 2019 at 4:54 PM:

> The overall deployment is Flink on YARN with HA. Flink is community version 1.7.2 and Hadoop is community version 2.8.5.
>
> The Flink cluster has 5 servers in total, each with a 4-core CPU and 8 GB of memory.
>
> The basic Flink configuration is:
> jobmanager.heap.size: 2048m
> taskmanager.heap.size: 2048m
> taskmanager.numberOfTaskSlots: 4
>
> Jobs are started with "run a job on flink", and each job currently has a parallelism of 1.
> The command looks like: flink run -d -m yarn-cluster ...
>
> After two jobs have been deployed successfully, the third job cannot start.
> Part of the startup log:
> 360 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - Submitting application master application_1554100483755_0013
> 2019-04-04 16:24:23,389 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1554100483755_0013
> 2019-04-04 16:24:23,389 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - Waiting for the cluster to be allocated
> 2019-04-04 16:24:23,390 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - Deploying cluster, current state ACCEPTED
> 2019-04-04 16:25:23,625 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - Deployment took more than 60 seconds. Please check if the requested resources are available in the YARN cluster
> 2019-04-04 16:25:23,876 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - Deployment took more than 60 seconds. Please check if the requested resources are available in the YARN cluster
> 2019-04-04 16:25:24,127 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - Deployment took more than 60 seconds. Please check if the requested resources are available in the YARN cluster
> 2019-04-04 16:25:24,378 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - Deployment took more than 60 seconds. Please check if the requested resources are available in the YARN cluster
>
> I could not find any other trace information. In the YARN console I can see that the container cannot be allocated. The page shows:
> YarnApplicationState: ACCEPTED: waiting for AM container to be allocated, launched and register with RM.
>
> Diagnostics: [Thu Apr 04 16:33:49 +0800 2019] Application is added to the scheduler and is not yet activated.
> Queue's AM resource limit exceeded. Details : AM Partition = <DEFAULT_PARTITION>; AM Resource Request = <memory:2048, vCores:1>; Queue Resource Limit for AM = <memory:4096, vCores:1>; User AM Resource Limit of the queue = <memory:4096, vCores:1>; Queue AM Resource Usage = <memory:4096, vCores:2>;
>
> 1. Given the machine specs and the parallelism settings above, and judging by the node view in the YARN console, a lot of memory and CPU is still unused. Why does this happen? Is some additional configuration needed?
>
> 2. Is there anything to watch out for when setting the basic options above (jobmanager.heap.size, taskmanager.heap.size, taskmanager.numberOfTaskSlots)? How are they usually set? I have noticed that in this launch mode every job gets its own JobManager and its own TaskManager.
