Keith Lee created FLINK-39923:
---------------------------------
Summary: RocksDB Statistics native memory leaks on state backend
rebuild when ticker metrics are enabled
Key: FLINK-39923
URL: https://issues.apache.org/jira/browse/FLINK-39923
Project: Flink
Issue Type: Bug
Components: Runtime / State Backends
Affects Versions: 2.1.3, 2.2.1, 2.0.2
Reporter: Keith Lee
When any of the 11 RocksDB ticker-type metric options is enabled, the
TaskManager leaks native memory in proportion to the number of keyed state
backend rebuilds (job restarts, rescaling, recovery cascades).
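To make the "in proportion to the number of rebuilds" claim concrete, here is a toy model in Python. It is only a sketch and calls neither Flink nor RocksDB: `FakeNativeStatistics`, `rebuild_backend`, and the 1 MiB per-instance size are hypothetical. It also assumes that the leak comes from a native object created on each rebuild and not released on dispose, which this ticket does not state explicitly.

```python
"""Toy model of a per-rebuild native allocation that is never released.

Everything here is hypothetical; it does not call RocksDB or Flink.
"""

NATIVE_BYTES_IN_USE = 0  # stands in for off-heap memory the JVM GC cannot reclaim


class FakeNativeStatistics:
    """Stand-in for a statistics object backed by native memory."""

    SIZE = 1 << 20  # assumed 1 MiB per instance, for illustration only

    def __init__(self):
        global NATIVE_BYTES_IN_USE
        NATIVE_BYTES_IN_USE += self.SIZE
        self._closed = False

    def close(self):
        """Release the simulated native memory; safe to call more than once."""
        global NATIVE_BYTES_IN_USE
        if not self._closed:
            NATIVE_BYTES_IN_USE -= self.SIZE
            self._closed = True


def rebuild_backend(rebuilds, release_on_dispose):
    """Simulate `rebuilds` backend rebuilds; return the native bytes still held."""
    global NATIVE_BYTES_IN_USE
    NATIVE_BYTES_IN_USE = 0
    for _ in range(rebuilds):
        stats = FakeNativeStatistics()  # created when the backend is built
        # ... the backend runs, the job fails, and the backend is disposed ...
        if release_on_dispose:
            stats.close()
    return NATIVE_BYTES_IN_USE


leaked = rebuild_backend(rebuilds=100, release_on_dispose=False)
fixed = rebuild_backend(rebuilds=100, release_on_dispose=True)
print(leaked // (1 << 20), "MiB held without close;", fixed, "bytes held with close")
```

In this model the retained memory grows linearly with the rebuild count when the object is never closed, which matches the behaviour described above for a job that restarts in a tight loop.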
Ticker-type metric options:
{quote}state.backend.rocksdb.metrics.block-cache-hit
state.backend.rocksdb.metrics.block-cache-miss
state.backend.rocksdb.metrics.bloom-filter-useful
state.backend.rocksdb.metrics.bloom-filter-full-positive
state.backend.rocksdb.metrics.bloom-filter-full-true-positive
state.backend.rocksdb.metrics.bytes-read
state.backend.rocksdb.metrics.iter-bytes-read
state.backend.rocksdb.metrics.bytes-written
state.backend.rocksdb.metrics.compaction-read-bytes
state.backend.rocksdb.metrics.compaction-write-bytes
state.backend.rocksdb.metrics.stall-micros
{quote}
The issue was reproduced and confirmed: an OOMKill was observed within 80
seconds of submitting a continuously failing job to a Flink cluster configured
with a low restart delay and ticker-type metrics enabled. Reproduction
instructions and scripts are available here:
[https://github.com/leekeiabstraction/flink/tree/reproduce-rocksdb-statistics-leak/reproduce-rocksdb-statistics-leak]
The jeprof dot-file output below (jemalloc profiling needs to be enabled) points
to about 770 MB (807403520 bytes) allocated under RocksDB's StatisticsJni.
{quote}Legend
[shape=box,fontsize=24,shape=plaintext,label="/proc/307/exe\lTotal B: 2855914662\lFocusing on: 2855914662\lDropped nodes with <= 14279573 abs(B)\lDropped edges with <= 2855914 B\l"];
N1 [label="je_prof_backtrace\n0 (0.0%)\rof 2040910591 (71.5%)\r",shape=box,fontsize=8.0];
N2 [label="je_prof_tctx_create\n0 (0.0%)\rof 2040910591 (71.5%)\r",shape=box,fontsize=8.0];
N3 [label="prof_backtrace_impl\n2040910591 (71.5%)\r",shape=box,fontsize=50.3];
N4 [label="je_malloc_default\n0 (0.0%)\rof 2032208910 (71.2%)\r",shape=box,fontsize=8.0];
N5 [label="Unsafe_AllocateMemory0\n0 (0.0%)\rof 1874666648 (65.6%)\r",shape=box,fontsize=8.0];
N6 [label="os\nmalloc@d01a60\n0 (0.0%)\rof 1874666648 (65.6%)\r",shape=box,fontsize=8.0];
N7 [label="0x00007fb705ffd460\n0 (0.0%)\rof 1874578289 (65.6%)\r",shape=box,fontsize=8.0];
N8 [label="Java_org_rocksdb_Statistics_newStatistics___3BJ\n0 (0.0%)\rof 807469136 (28.3%)\r",shape=box,fontsize=8.0];
N9 [label="rocksdb\nCoreLocalArray\nCoreLocalArray\n0 (0.0%)\rof 807403520 (28.3%)\r",shape=box,fontsize=8.0];
N10 [label="rocksdb\nStatisticsImpl\nStatisticsImpl\n0 (0.0%)\rof 807403520 (28.3%)\r",shape=box,fontsize=8.0];
N11 [label="rocksdb\nStatisticsJni\nStatisticsJni\n0 (0.0%)\rof 807403520 (28.3%)\r",shape=box,fontsize=8.0];
N12 [label="rocksdb\nport\ncacheline_aligned_alloc\n807403520 (28.3%)\r",shape=box,fontsize=34.6];
N13 [label="0x00007fb7068b4ceb\n0 (0.0%)\rof 578879568 (20.3%)\r",shape=box,fontsize=8.0];
{quote}
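The 770 MB figure can be checked directly from the node labels in the jeprof output. The following Python sketch parses three of those labels (copied from the output above, with line wraps and link markup removed) and converts the cumulative byte counts to MiB and to a share of the profile total. The helper name `parse_node` is hypothetical, and joining the label's name segments with `::` rests on the assumption that jeprof split C++ qualified names at `::` when it wrote the labels.

```python
import re

# Node labels copied from the jeprof dot output, one node per line.
DOT_LINES = [
    r'N3 [label="prof_backtrace_impl\n2040910591 (71.5%)\r",shape=box,fontsize=50.3];',
    r'N11 [label="rocksdb\nStatisticsJni\nStatisticsJni\n0 (0.0%)\rof 807403520 (28.3%)\r",shape=box,fontsize=8.0];',
    r'N12 [label="rocksdb\nport\ncacheline_aligned_alloc\n807403520 (28.3%)\r",shape=box,fontsize=34.6];',
]
TOTAL_BYTES = 2855914662  # "Total B" from the Legend node

NODE_RE = re.compile(r'^(N\d+) \[label="(.*)",shape=')


def parse_node(line):
    """Return (name, flat_bytes, cumulative_bytes) for one dot node line."""
    match = NODE_RE.match(line)
    if match is None:
        raise ValueError(f"not a node line: {line!r}")
    label = match.group(2)
    # The label contains the two-character sequence backslash + n as a separator;
    # the last segment holds the byte counts, the earlier ones hold the name.
    name_part, _, counts = label.rpartition(r'\n')
    name = name_part.replace(r'\n', '::')  # assumption: segments were split at "::"
    numbers = [int(n) for n in re.findall(r'(\d+) \(', counts)]
    # A label with a single count means flat and cumulative bytes are equal.
    return name, numbers[0], numbers[-1]


for line in DOT_LINES:
    name, flat, cumulative = parse_node(line)
    mib = cumulative / 2**20
    share = 100 * cumulative / TOTAL_BYTES
    print(f"{name}: cumulative {mib:.1f} MiB ({share:.1f}% of total)")
```

For the `StatisticsJni` node the cumulative count of 807403520 bytes works out to exactly 770 MiB, or about 28.3% of the 2855914662 bytes in the profile.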
--
This message was sent by Atlassian Jira
(v8.20.10#820010)