Hi Kenan,

If you have confirmed that heap memory is fine (e.g. no Java OOM exceptions and no frequent GC), then the cause may be excessive off-heap memory usage, especially if your Flink job uses a native library. To diagnose this kind of problem, see [1] and [2] for details on using NMT (Native Memory Tracking) and jeprof.
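
Before going deep with those tools, it can help to record how far the process RSS drifts above the configured budget over time. Below is a minimal Python sketch, assuming a Linux host where /proc/<pid>/status is readable. The function names and the sample values are hypothetical, chosen only to mirror the numbers reported in the quoted message (a 40000m process size and roughly 59GB RSS); this is not part of Flink.

```python
import re
from pathlib import Path


def rss_mib_from_status(status_text: str) -> float:
    """Extract VmRSS from the text of /proc/<pid>/status and return it in MiB."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    if match is None:
        raise ValueError("VmRSS not found in status text")
    return int(match.group(1)) / 1024


def overshoot_mib(rss_mib: float, process_size_mib: float) -> float:
    """Return how far RSS exceeds the configured budget (0.0 if within it)."""
    return max(0.0, rss_mib - process_size_mib)


def check_pid(pid: int, process_size_mib: float) -> float:
    """Read the live status file of a process and report its overshoot in MiB."""
    status = Path(f"/proc/{pid}/status").read_text()
    return overshoot_mib(rss_mib_from_status(status), process_size_mib)


if __name__ == "__main__":
    # Illustrative status text only; 61865984 kB corresponds to 59 GiB.
    sample = "Name:\tjava\nVmRSS:\t61865984 kB\n"
    rss = rss_mib_from_status(sample)
    print(f"RSS {rss:.0f} MiB, overshoot {overshoot_mib(rss, 40000):.0f} MiB")
```

Calling check_pid with the TaskManager PID at regular intervals shows whether the overshoot grows steadily (which would suggest a leak) or settles at a plateau. If NMT is enabled, the difference between RSS and the total reported by jcmd <pid> VM.native_memory summary is memory the JVM does not track, which is typically where allocations made by native libraries end up.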

[1] https://erikwramner.files.wordpress.com/2017/10/native-memory-leaks-in-java.pdf
[2] https://www.evanjones.ca/java-native-leak-bug.html

Best,
Biao Geng

Kenan Kılıçtepe <kkilict...@gmail.com> wrote on Wed, Sep 6, 2023 at 20:32:

> Hi,
>
> I have Flink 1.16.2 on a single server with 64GB RAM.
>
> Although taskmanager.memory.process.size is set to 40000m, I can see the
> memory usage of the task manager exceed 59GB, and the OS kills it because of
> OOM.
> I check the RSS column in top for memory usage.
>
> I don't see any heap memory problem.
>
> taskmanager.memory.process.size: 40000m
> taskmanager.memory.managed.fraction: 0.53
> state.backend.rocksdb.memory.managed: true
>
> Any help in analyzing the problem is appreciated.
>
> Thanks