subject:"Frequent OOM after the job restarts due to an exception"

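Re: Frequent OOM after the job restarts due to an exception

2020-07-04 Posted by Congxian Qiu

If possible, take a memory dump of the whole process when the OOM happens, then analyze it to see which kind of memory is using more than expected.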

Best,
Congxian
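
Editorial sketch (not part of Congxian's message): short of a full dump, one lightweight way to see which memory is larger than expected is to total the resident set size per mapping from /proc/<pid>/smaps on the TaskManager host. The Python below is a minimal sketch under that assumption; the function name rss_by_mapping, the SAMPLE text, and the library path in it are hypothetical and only illustrate the smaps layout.

```python
import re
from collections import defaultdict

# A mapping header in smaps looks like:
#   7f0000200000-7f0000300000 r-xp 00000000 08:01 1234 /path/to/lib.so
# Anonymous mappings have no trailing path.
HEADER = re.compile(r"^[0-9a-f]+-[0-9a-f]+\s+\S+\s+\S+\s+\S+\s+\d+\s*(.*)$")


def rss_by_mapping(smaps_text):
    """Sum the 'Rss:' values (in kB) per mapping name from smaps-formatted text."""
    totals = defaultdict(int)
    current = None
    for line in smaps_text.splitlines():
        m = HEADER.match(line)
        if m:
            # Group all unnamed mappings under one label.
            current = m.group(1).strip() or "[anon]"
        elif line.startswith("Rss:") and current is not None:
            totals[current] += int(line.split()[1])
    return dict(totals)


# Hypothetical sample in smaps layout; the path and sizes are made up.
SAMPLE = """\
7f0000000000-7f0000200000 rw-p 00000000 00:00 0
Size:               2048 kB
Rss:                1024 kB
7f0000200000-7f0000300000 r-xp 00000000 08:01 1234 /usr/lib/librocksdbjni.so
Size:               1024 kB
Rss:                 512 kB
7f0000300000-7f0000400000 rw-p 00000000 00:00 0
Size:               1024 kB
Rss:                 256 kB
"""

if __name__ == "__main__":
    # Largest consumers first.
    for name, kb in sorted(rss_by_mapping(SAMPLE).items(), key=lambda kv: -kv[1]):
        print(f"{kb:>10} kB  {name}")
```

On a live process the input would come from reading /proc/<pid>/smaps for the TaskManager's java pid. A large "[anon]" total that keeps growing across job restarts would point toward native allocations rather than the JVM heap, though this alone does not identify the allocator.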


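SmileSmile wrote on Wed, Jul 1, 2020 at 12:49 PM:

> What is the detailed error for your OOM: is metaspace exhausted, or is the process being killed by the OS?
>
> a511955993
> Email: a511955...@163.com
>
> On 2020-07-01 11:32, kcz wrote:
> I have also run into this on 1.10.0. According to the 1.11.0 release notes the classloader will be reused; I don't know whether that fixes this issue.
> In my case the first run was fine. After stopping the job and starting it again, I hit the OOM. Increasing metaspace let it run again, but after repeatedly stopping and restarting, the OOM came back.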

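Re: Frequent OOM after the job restarts due to an exception

2020-06-30 Posted by 徐骁

I ran into this problem a long time ago: in standalone mode, metaspace is never released. It looks like a fairly serious bug.
It was discussed at https://issues.apache.org/jira/browse/FLINK-11205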


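Re: Frequent OOM after the job restarts due to an exception

2020-06-30 Posted by SmileSmile

When the job runs normally, off-heap memory is sufficient. The frequent restarts only show up after a restart, and only rebuilding the cluster brings things back to normal.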



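Re: Frequent OOM after the job restarts due to an exception

2020-06-30 Posted by LakeShen

On an older version, running Flink on k8s, I also had the process OOM-killed.

I suspect the TaskManager ran out of off-heap memory. I am currently on Flink 1.6, Flink on k8s in standalone per-job mode, where off-heap memory is unlimited by default.

My fix was to add one parameter: taskmanager.memory.off-heap: true.

So far the OOM kill has not happened again. Hope this helps.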

Best,
LakeShen


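Re: Frequent OOM after the job restarts due to an exception

2020-06-30 Posted by SmileSmile

One more detail: the kernel version is 3.10.x. Could the overuse be caused by off-heap memory cache that is not being reclaimed?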



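Frequent OOM after the job restarts due to an exception

2020-06-30 Posted by GuoSmileSmil

hi all,

I am running Flink 1.10.1 with the RocksDB state backend, checkpointing disabled, on Kubernetes in standalone mode.

The problem: when the job fails because a pod is lost to network jitter or a hardware fault, the pod is recreated and the job restarts automatically. After the job has run for a while (anywhere from half an hour to an hour), other pods are very likely to be OOM-killed by the OS. This then repeats in a loop, and pods get killed more and more often. My current workaround is to destroy the cluster manually, build a new one, and restart the job, after which everything is back to normal.

With a purely heap-based state backend, a job restart does not cause this problem.

My tentative guess: after the job fails, native memory is not fully released. Suppose the pod limit is 10G; after the job restarts only 8G is actually available, but the TM still runs as if it had 10G, so the pod's memory usage exceeds 10G and it is killed by the OS (pure speculation).

Does anyone have suggestions or a solution?

Here is the kernel log from one occurrence: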


Jun 30 21:59:15 flink-tm-1 kernel: memory: usage 28672000kB, limit 28672000kB, failcnt 11225
Jun 30 21:59:15 flink-tm-1 kernel: memory+swap: usage 28672000kB, limit 9007199254740988kB, failcnt 0
Jun 30 21:59:15 flink-tm-1 kernel: kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
Jun 30 21:59:15 flink-tm-1 kernel: Memory cgroup stats for /kubepods.slice/kubepods-pod5ad5d2ea_5faa_4a11_96b4_39271ab76e99.slice: cache:0KB rss:0KB rss_huge:0KB mapped_file:0KB swap:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB
Jun 30 21:59:15 flink-tm-1 kernel: Memory cgroup stats for /kubepods.slice/kubepods-pod5ad5d2ea_5faa_4a11_96b4_39271ab76e99.slice/docker-fe101418a3b2a7c534e89b4ac73d29b04070eb923220a5b17338850bbdb3817a.scope: cache:0KB rss:44KB rss_huge:0KB mapped_file:0KB swap:0KB inactive_anon:0KB active_anon:44KB inactive_file:0KB active_file:0KB unevictable:0KB
Jun 30 21:59:15 flink-tm-1 kernel: Memory cgroup stats for /kubepods.slice/kubepods-pod5ad5d2ea_5faa_4a11_96b4_39271ab76e99.slice/docker-a2295e812a828738810a8f1ae69cd48e99ef98b9e1038158a6e33f81524cc02a.scope: cache:180KB rss:28671776KB rss_huge:26437632KB mapped_file:144KB swap:0KB inactive_anon:0KB active_anon:28671760KB inactive_file:4KB active_file:4KB unevictable:0KB
Jun 30 21:59:15 flink-tm-1 kernel: [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
Jun 30 21:59:15 flink-tm-1 kernel: [16875]     0 16875      253        1       4        0          -998 pause
Jun 30 21:59:15 flink-tm-1 kernel: [17274]     0 17274     1369      421       7        0          -998 bash
Jun 30 21:59:15 flink-tm-1 kernel: [18089]     0 18089 10824832  7174316   14500        0          -998 java
Jun 30 21:59:15 flink-tm-1 kernel: [18348]     0 18348     1017      196       6        0          -998 tail
Jun 30 21:59:15 flink-tm-1 kernel: Memory cgroup out of memory: Kill process 26824 (Window(Tumbling) score 4 or sacrifice child
Jun 30 21:59:15 flink-tm-1 kernel: Killed process 18089 (java) total-vm:43299328kB, anon-rss:28669084kB, file-rss:28180kB, shmem-rss:0kB

Looking forward to your reply and help.

Best
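
Editorial sketch (not part of the original message): the numbers in the kernel log above can be cross-checked with a little arithmetic. The task table reports total_vm and rss in pages, while the cgroup limit and the "Killed process" line are in kB. The Python below is a minimal sketch that assumes 4 KiB pages, which the thread does not state; the helper names are hypothetical, and the two input strings are copied from the log with line wrapping removed.

```python
import re

PAGE_KB = 4  # assumption: 4 KiB pages; not stated anywhere in the thread

# Copied from the kernel log, unwrapped.
LIMIT_LINE = "memory: usage 28672000kB, limit 28672000kB, failcnt 11225"
JAVA_ROW = "[18089] 0 18089 10824832 7174316 14500 0 -998 java"


def parse_usage_and_limit_kb(line):
    """Return (usage_kb, limit_kb) from a cgroup 'usage ..., limit ...' line."""
    m = re.search(r"usage (\d+)kB, limit (\d+)kB", line)
    if m is None:
        raise ValueError("not a cgroup usage line: " + line)
    return int(m.group(1)), int(m.group(2))


def parse_task_row(line):
    """Parse one row of the OOM task table; total_vm and rss are in pages."""
    fields = line.replace("[", " ").replace("]", " ").split()
    pid, uid, tgid, total_vm, rss, nr_ptes, swapents, adj = map(int, fields[:8])
    return {"pid": pid, "total_vm": total_vm, "rss": rss, "name": fields[8]}


usage_kb, limit_kb = parse_usage_and_limit_kb(LIMIT_LINE)
java = parse_task_row(JAVA_ROW)
java_rss_kb = java["rss"] * PAGE_KB
java_total_vm_kb = java["total_vm"] * PAGE_KB

if __name__ == "__main__":
    print("cgroup limit :", limit_kb, "kB =", limit_kb / 1024 / 1024, "GiB")
    print("java rss     :", java_rss_kb, "kB")
    print("java total_vm:", java_total_vm_kb, "kB")
```

Under the 4 KiB assumption the java row gives an rss of 28697264 kB, which equals anon-rss plus file-rss from the "Killed process" line (28669084 kB + 28180 kB), and a total_vm of 43299328 kB, which matches total-vm on the same line. So the page-size assumption is consistent with the log, and a single JVM process accounts for essentially the whole 28672000 kB pod limit.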
