Re: Re: Flink on YARN job fails at startup: The assigned slot container_e10_1579661300080_0005_01_000002_0 was removed.

2020-01-28, posted by 郑 洁锋
There is no log file, only err and out, and out is empty.


zjfpla...@hotmail.com

From: tison
Sent: 2020-01-24 10:03
To: user-zh
Cc: zhisheng2018
Subject: Re: Re: Flink on YARN job fails at startup: The assigned slot
container_e10_1579661300080_0005_01_000002_0 was removed.
What you posted above is taskmanager.err; what is needed is taskmanager.log.

Best,
tison.


郑 洁锋 wrote on Thu, Jan 23, 2020 at 10:22 PM:

> It crashed before, and then on the later startup the checkpoint files were lost? Is that what you mean?
>
> 
> zjfpla...@hotmail.com
>
> From: zhisheng
> Sent: 2020-01-22 16:45
> To: user-zh
> Subject: Re: Flink on YARN job fails at startup: The assigned slot
> container_e10_1579661300080_0005_01_000002_0 was removed.
> Most likely your job crashed earlier.
>
> 郑 洁锋 wrote on Wed, Jan 22, 2020 at 11:16 AM:
>
> > Hi all,
> > When starting a Flink on YARN job, I hit the error: The assigned slot
> > container_e10_1579661300080_0005_01_000002_0 was removed.
> > Environment: Flink 1.8.1, CDH 5.14.2, Kafka 0.10, JDK 1.8.0_241
> >
> > The Flink version is 1.8.1. The log on YARN:
> >
> > 20/01/22 11:07:53 INFO entrypoint.ClusterEntrypoint:
> >
> > 20/01/22 11:07:53 INFO entrypoint.ClusterEntrypoint:  Starting YarnJobClusterEntrypoint (Version: , Rev:7297bac, Date:24.06.2019 @ 23:04:28 CST)
> > 20/01/22 11:07:53 INFO entrypoint.ClusterEntrypoint:  OS current user: cloudera-scm
> > 20/01/22 11:07:53 INFO entrypoint.ClusterEntrypoint:  Current Hadoop/Kerberos user: root
> > 20/01/22 11:07:53 INFO entrypoint.ClusterEntrypoint:  JVM: Java HotSpot(TM) 64-Bit Server VM - Oracle Corporation - 1.8/25.241-b07
> > 20/01/22 11:07:53 INFO entrypoint.ClusterEntrypoint:  Maximum heap size: 406 MiBytes
> > 20/01/22 11:07:53 INFO entrypoint.ClusterEntrypoint:  JAVA_HOME: /usr/java/default
> > 20/01/22 11:07:53 INFO entrypoint.ClusterEntrypoint:  Hadoop version: 2.6.5
> > 20/01/22 11:07:53 INFO entrypoint.ClusterEntrypoint:  JVM Options:
> > 20/01/22 11:07:53 INFO entrypoint.ClusterEntrypoint: -Xms424m
> > 20/01/22 11:07:53 INFO entrypoint.ClusterEntrypoint: -Xmx424m
> > 20/01/22 11:07:53 INFO entrypoint.ClusterEntrypoint:  Program Arguments: (none)
> > 20/01/22 11:07:53 INFO entrypoint.ClusterEntrypoint:  Classpath:
> >

Re: Re: Flink on YARN job fails at startup: The assigned slot container_e10_1579661300080_0005_01_000002_0 was removed.

2020-01-23, posted by tison
What you posted above is taskmanager.err; what is needed is taskmanager.log.

Best,
tison.
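
To tell quickly which of the three TaskManager files a container actually produced, a small check like the following may help. It is only a sketch: the function name `describe_container_logs` is made up for illustration, the three file names are the ones mentioned in this thread, and the demo runs on a temporary directory rather than on a real YARN container log directory (whose location depends on the cluster configuration).

```python
from pathlib import Path
import tempfile

# File names as mentioned in this thread.
EXPECTED = ("taskmanager.log", "taskmanager.err", "taskmanager.out")


def describe_container_logs(log_dir):
    """Map each expected file name to 'missing', 'empty', or its size in bytes."""
    report = {}
    for name in EXPECTED:
        path = Path(log_dir) / name
        if not path.is_file():
            report[name] = "missing"
        elif path.stat().st_size == 0:
            report[name] = "empty"
        else:
            report[name] = f"{path.stat().st_size} bytes"
    return report


# Demo: a temporary directory that mimics the situation reported in this
# thread (no .log file, a non-empty .err file, an empty .out file).
with tempfile.TemporaryDirectory() as demo_dir:
    (Path(demo_dir) / "taskmanager.err").write_text("example stderr output\n")
    (Path(demo_dir) / "taskmanager.out").write_text("")
    report = describe_container_logs(demo_dir)
    print(report)
```

A missing taskmanager.log next to a non-empty taskmanager.err suggests looking at the .err content first, since the process may have failed before logging was set up; that reading is an assumption, not something confirmed in this thread.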


郑 洁锋 wrote on Thu, Jan 23, 2020 at 10:22 PM:

> It crashed before, and then on the later startup the checkpoint files were lost? Is that what you mean?
>
> 
> zjfpla...@hotmail.com
>
> From: zhisheng
> Sent: 2020-01-22 16:45
> To: user-zh
> Subject: Re: Flink on YARN job fails at startup: The assigned slot
> container_e10_1579661300080_0005_01_000002_0 was removed.
> Most likely your job crashed earlier.

Re: Re: Flink on YARN job fails at startup: The assigned slot container_e10_1579661300080_0005_01_000002_0 was removed.

2020-01-23, posted by 郑 洁锋
It crashed before, and then on the later startup the checkpoint files were lost? Is that what you mean?


zjfpla...@hotmail.com

From: zhisheng
Sent: 2020-01-22 16:45
To: user-zh
Subject: Re: Flink on YARN job fails at startup: The assigned slot
container_e10_1579661300080_0005_01_000002_0 was removed.
Most likely your job crashed earlier.


Re: Re: Flink on YARN job fails at startup: The assigned slot container_e10_1579661300080_0005_01_000002_0 was removed.

2020-01-23, posted by 郑 洁锋
The log is already in my earlier email.


zjfpla...@hotmail.com

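From: tison
Sent: 2020-01-22 12:10
To: user-zh
Subject: Re: Re: Flink on YARN job fails at startup: The assigned slot
container_e10_1579661300080_0005_01_000002_0 was removed.
Then check the TM log on the machine where the TM ran. From the JM side, the TM did start successfully at one point and registered itself, so find out how the TM died, or whether something else happened.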

Best,
tison.


郑 洁锋 wrote on Wed, Jan 22, 2020 at 11:54 AM:

> The TM did not start. The server itself has enough memory and CPU and is largely idle.
>
> 
> zjfpla...@hotmail.com
>
> From: tison
> Sent: 2020-01-22 11:25
> To: user-zh
> Subject: Re: Flink on YARN job fails at startup: The assigned slot
> container_e10_1579661300080_0005_01_000002_0 was removed.
> 20/01/22 11:08:49 INFO yarn.YarnResourceManager: Closing TaskExecutor
> connection container_e10_1579661300080_0005_01_000002 because: The
> heartbeat of TaskManager with id container_e10_1579661300080_0005_01_000002
> timed out.
>
> When you requested resources, the slot request was sent to this machine, and then its heartbeat timed out. Check whether the TM started properly, and whether it ran short of resources or crashed.
>
> Best,
> tison.
>

Re: Flink on YARN job fails at startup: The assigned slot container_e10_1579661300080_0005_01_000002_0 was removed.

2020-01-22, posted by zhisheng
Most likely your job crashed earlier.


Re: Re: Flink on YARN job fails at startup: The assigned slot container_e10_1579661300080_0005_01_000002_0 was removed.

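2020-01-21, posted by tison
Then check the TM log on the machine where the TM ran. From the JM side, the TM did start successfully at one point and registered itself, so find out how the TM died, or whether something else happened.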

Best,
tison.
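
If log aggregation is enabled on the cluster (an assumption; the thread does not say), one way to get at the TM logs without logging in to the machine is `yarn logs -applicationId <application ID>`. The application ID can be derived from the container ID in the error message. The sketch below only builds the command and does not run it; the helper names `application_id` and `yarn_logs_command` are made up for illustration.

```python
import re

# YARN container IDs have the shape
#   container_[e<epoch>_]<clusterTimestamp>_<appNumber>_<attempt>_<containerNumber>
CONTAINER_ID = re.compile(
    r"^container_(?:e\d+_)?(?P<cluster_ts>\d+)_(?P<app>\d+)_(?P<attempt>\d+)_(?P<num>\d+)$"
)


def application_id(container_id):
    """Derive the YARN application ID that a container belongs to."""
    match = CONTAINER_ID.match(container_id)
    if match is None:
        raise ValueError(f"not a YARN container ID: {container_id!r}")
    return f"application_{match.group('cluster_ts')}_{match.group('app')}"


def yarn_logs_command(container_id):
    """Build (but do not execute) the command that fetches the aggregated logs."""
    return ["yarn", "logs", "-applicationId", application_id(container_id)]


cmd = yarn_logs_command("container_e10_1579661300080_0005_01_000002")
print(" ".join(cmd))
# prints: yarn logs -applicationId application_1579661300080_0005
```

Note that the slot ID in the subject line carries an extra `_0` suffix after the container ID, so it has to be trimmed before being passed to these helpers.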


郑 洁锋 wrote on Wed, Jan 22, 2020 at 11:54 AM:

> The TM did not start. The server itself has enough memory and CPU and is largely idle.
>
> 
> zjfpla...@hotmail.com
>
> From: tison
> Sent: 2020-01-22 11:25
> To: user-zh
> Subject: Re: Flink on YARN job fails at startup: The assigned slot
> container_e10_1579661300080_0005_01_000002_0 was removed.
> 20/01/22 11:08:49 INFO yarn.YarnResourceManager: Closing TaskExecutor
> connection container_e10_1579661300080_0005_01_000002 because: The
> heartbeat of TaskManager with id container_e10_1579661300080_0005_01_000002
> timed out.
>
> When you requested resources, the slot request was sent to this machine, and then its heartbeat timed out. Check whether the TM started properly, and whether it ran short of resources or crashed.
>
> Best,
> tison.
>

Re: Re: Flink on YARN job fails at startup: The assigned slot container_e10_1579661300080_0005_01_000002_0 was removed.

2020-01-21, posted by 郑 洁锋
The TM did not start. The server itself has enough memory and CPU and is largely idle.


zjfpla...@hotmail.com

From: tison
Sent: 2020-01-22 11:25
To: user-zh
Subject: Re: Flink on YARN job fails at startup: The assigned slot
container_e10_1579661300080_0005_01_000002_0 was removed.
20/01/22 11:08:49 INFO yarn.YarnResourceManager: Closing TaskExecutor
connection container_e10_1579661300080_0005_01_000002 because: The
heartbeat of TaskManager with id container_e10_1579661300080_0005_01_000002
timed out.

When you requested resources, the slot request was sent to this machine, and then its heartbeat timed out. Check whether the TM started properly, and whether it ran short of resources or crashed.

Best,
tison.



Re: Flink on YARN job fails at startup: The assigned slot container_e10_1579661300080_0005_01_000002_0 was removed.

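2020-01-21, posted by tison
20/01/22 11:08:49 INFO yarn.YarnResourceManager: Closing TaskExecutor
connection container_e10_1579661300080_0005_01_000002 because: The
heartbeat of TaskManager with id container_e10_1579661300080_0005_01_000002
timed out.

When you requested resources, the slot request was sent to this machine, and then its heartbeat timed out. Check whether the TM started properly, and whether it ran short of resources or crashed.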

Best,
tison.
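
In a long JobManager log, the heartbeat-timeout line quoted above is easy to miss. The following sketch shows one possible way to pull such lines out and see which container timed out and when. The regular expression is written against the single log line quoted in this message and may not match other Flink versions; the function name `find_heartbeat_timeouts` is hypothetical, and the first sample line is abbreviated from the startup log in this thread.

```python
import re
from datetime import datetime

# Pattern for the JobManager line quoted above (a single line in the real log).
HEARTBEAT_TIMEOUT = re.compile(
    r"^(?P<ts>\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) INFO\s+\S+: "
    r"Closing TaskExecutor connection (?P<container>container_\S+) because: "
    r"The heartbeat of TaskManager with id \S+ timed out\."
)


def find_heartbeat_timeouts(log_lines):
    """Return (timestamp, container_id) pairs for every heartbeat-timeout line."""
    hits = []
    for line in log_lines:
        match = HEARTBEAT_TIMEOUT.match(line.strip())
        if match:
            ts = datetime.strptime(match.group("ts"), "%y/%m/%d %H:%M:%S")
            hits.append((ts, match.group("container")))
    return hits


sample = [
    # Abbreviated startup line from the log in this thread.
    "20/01/22 11:07:53 INFO entrypoint.ClusterEntrypoint:  Starting YarnJobClusterEntrypoint",
    # The heartbeat-timeout line quoted above, joined back into one line.
    "20/01/22 11:08:49 INFO yarn.YarnResourceManager: Closing TaskExecutor connection "
    "container_e10_1579661300080_0005_01_000002 because: The heartbeat of TaskManager "
    "with id container_e10_1579661300080_0005_01_000002 timed out.",
]

timeouts = find_heartbeat_timeouts(sample)
for ts, container in timeouts:
    print(ts.isoformat(), container)
```

The container ID found this way identifies the TaskManager whose own log should be inspected next.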

