[ https://issues.apache.org/jira/browse/FLINK-38212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Grzegorz Liter updated FLINK-38212:
-----------------------------------
    Attachment: image-2025-08-07-17-14-35-648.png

> OOM during savepoint caused by potential memory leak issue in RocksDB related to jemalloc
> ------------------------------------------------------------------------------------------
>
>                 Key: FLINK-38212
>                 URL: https://issues.apache.org/jira/browse/FLINK-38212
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Checkpointing
>    Affects Versions: 1.20.2, 2.1.0
>         Environment: Flink 2.1.0 running in Application mode with Flink Operator 1.12.1.
> Memory and savepoint related settings:
> {code:java}
> env.java.opts.taskmanager: ' -XX:+UnlockExperimentalVMOptions
>   -XX:+UseStringDeduplication
>   -XX:+AlwaysPreTouch -XX:G1HeapRegionSize=16m
>   -Xlog:gc*:file=/tmp/gc.log:time,uptime,level,tags
>   -XX:SurvivorRatio=6 -XX:G1NewSizePercent=40 '
> execution.checkpointing.max-concurrent-checkpoints: "1"
> execution.checkpointing.snapshot-compression: "true"
> fs.s3a.aws.credentials.provider: com.amazonaws.auth.WebIdentityTokenCredentialsProvider
> fs.s3a.block.size:
> fs.s3a.experimental.input.fadvise: sequential
> fs.s3a.path.style.access: "true"
> state.backend.incremental: "true"
> state.backend.type: rocksdb
> state.checkpoints.dir: s3p://bucket/checkpoints
> state.savepoints.dir: s3p://bucket/savepoints
> taskmanager.memory.jvm-overhead.fraction: "0.1"
> taskmanager.memory.jvm-overhead.max: 6g
> taskmanager.memory.managed.fraction: "0.4"
> taskmanager.memory.network.fraction: "0.05"
> taskmanager.network.memory.buffer-debloat.enabled: "true"
> taskmanager.numberOfTaskSlots: "12"
> ...
> resource:
>   memory: 16g{code}
>
>            Reporter: Grzegorz Liter
>            Priority: Major
>         Attachments: image-2025-08-07-17-13-33-041.png, image-2025-08-07-17-14-03-023.png, image-2025-08-07-17-14-35-648.png, image-2025-08-07-17-15-11-647.png
>
>
> I am running a job with a snapshot size of about 17 GB, with compression enabled. I have observed that savepoints often fail because the TM gets killed by Kubernetes for exceeding the memory limit of its pod, which had a 30 GB memory limit assigned.
> Neither Flink metrics nor detailed VM metrics taken with `jcmd <PID> VM.native_memory detail` indicate any unusual memory increase (see the NMT note at the end of this description); the consumed memory is visible only in Kubernetes metrics and RSS.
> With enough memory set (plus, potentially, a large enough JVM overhead) to leave some breathing room, one snapshot could be taken, but taking subsequent full snapshots reliably leads to OOM.
> This documentation: [switching-the-memory-allocator|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/resource-providers/standalone/docker/#switching-the-memory-allocator] led me to try:
> {code:java}
> MALLOC_ARENA_MAX=1
> DISABLE_JEMALLOC=true {code}
> This configuration made savepoints reliably pass without OOM. I tried setting only one of the two options at a time, but that did not fix the issue.
> I also tried downscaling the pod to 16 GB of memory; with these options the savepoint was still reliably created without any issue, while without them every savepoint fails.
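>
> For anyone wanting to reproduce the workaround under the Flink Kubernetes Operator: below is a minimal sketch of how the two variables can be injected into the TaskManager pods via the pod template. This is not the exact manifest used here; only the two `env` entries matter, the rest of the spec is omitted, and the container name follows what the operator documentation uses for the main container.
> {code:yaml}
> # Fragment of a FlinkDeployment spec (Flink Kubernetes Operator); sketch only.
> spec:
>   podTemplate:
>     spec:
>       containers:
>         - name: flink-main-container      # container the operator merges env vars into
>           env:
>             # Official Flink image entrypoint falls back to glibc malloc when this is set
>             - name: DISABLE_JEMALLOC
>               value: "true"
>             # glibc tuning: cap the number of malloc arenas to limit RSS growth
>             - name: MALLOC_ARENA_MAX
>               value: "1"
> {code}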
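>
> For completeness, a note on how the detailed VM metrics mentioned above were collected: `jcmd <PID> VM.native_memory detail` only produces output when Native Memory Tracking is enabled on the TaskManager JVM. A minimal sketch, assuming NMT is switched on through the existing `env.java.opts.taskmanager` option:
> {code:yaml}
> # Append to the env.java.opts.taskmanager value from the Environment section,
> # then run `jcmd <PID> VM.native_memory detail` inside the TM pod.
> env.java.opts.taskmanager: '... -XX:NativeMemoryTracking=detail'
> {code}
> NMT only accounts for memory the JVM itself tracks, so native allocations made by RocksDB/jemalloc outside the JVM would not show up there, which is consistent with the growth being visible only in RSS and Kubernetes metrics.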