[
https://issues.apache.org/jira/browse/KAFKA-20616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Bill Bejeck updated KAFKA-20616:
--------------------------------
Description:
Primary leak (KAFKA-20456 follow-up). RocksDBStore.createOffsetsCFOptions()
returns a new ColumnFamilyOptions() that is passed to a ColumnFamilyDescriptor
and then dropped; it is never assigned to a field and never closed. On the JNI
side, constructing a ColumnFamilyOptions auto-allocates a default
BlockBasedTableFactory with an 8 MB LRUCache. The leak compounds per segment,
per task, so windowed/segmented stores amplify it heavily.
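
The ownership problem described above can be sketched in plain Java. This is
only an illustration under stated assumptions, not the actual patch:
FakeNativeOptions is a hypothetical stand-in for a JNI-backed object such as
ColumnFamilyOptions, and OffsetsOptionsOwner, createUntracked() and
createTracked() are made-up names. The idea is that whoever creates the
options object keeps a reference to it so that close() can release it later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a JNI-backed options object. It allocates
// nothing native; it only counts instances that have not been closed.
final class FakeNativeOptions implements AutoCloseable {
    static final AtomicInteger LIVE = new AtomicInteger();
    private boolean closed;

    FakeNativeOptions() {
        LIVE.incrementAndGet();
    }

    @Override
    public void close() {
        if (!closed) {          // make close() idempotent
            closed = true;
            LIVE.decrementAndGet();
        }
    }
}

// Hypothetical owner that remembers every options object it hands out.
final class OffsetsOptionsOwner implements AutoCloseable {
    private final List<FakeNativeOptions> owned = new ArrayList<>();

    // Leaky shape: a fresh object is returned and nobody tracks it.
    static FakeNativeOptions createUntracked() {
        return new FakeNativeOptions();
    }

    // Tracked shape: the owner keeps the reference for later release.
    FakeNativeOptions createTracked() {
        FakeNativeOptions options = new FakeNativeOptions();
        owned.add(options);
        return options;
    }

    @Override
    public void close() {
        for (FakeNativeOptions options : owned) {
            options.close();
        }
        owned.clear();
    }
}

class OptionsLeakDemo {
    public static void main(String[] args) {
        int before = FakeNativeOptions.LIVE.get();

        OffsetsOptionsOwner.createUntracked(); // reference dropped, never closed
        System.out.println("leaked after untracked create: "
                + (FakeNativeOptions.LIVE.get() - before));

        int mid = FakeNativeOptions.LIVE.get();
        try (OffsetsOptionsOwner owner = new OffsetsOptionsOwner()) {
            owner.createTracked();
        }
        System.out.println("leaked after tracked create: "
                + (FakeNativeOptions.LIVE.get() - mid));
    }
}
```

In the sketch the untracked object stays counted as live for the life of the
process, which mirrors how one unreleased options object per segment adds up
across tasks.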
Secondary leak (KIP-1035 close path). AbstractColumnFamilyAccessor.close()
writes a closedState marker to the offsets CF. If that write throws (which
happens during the EOSv2 cascade or an unclean shutdown, a case the existing
code comment already acknowledges), the subsequent
offsetColumnFamilyHandle.close() is skipped. SingleColumnFamilyAccessor.close()
and DualColumnFamilyAccessor.close() have the same ordering with no finally
block, so the data CF handles (and the oldCF/newCF handles for migrating
stores) also leak whenever super.close() propagates an exception.
RocksDBStore.close() swallows the resulting RocksDBException, so the leak is
silent.
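
The close ordering described above can be illustrated with a small
self-contained sketch. Everything here is an assumption made for illustration:
FakeHandle, BaseAccessor and DataAccessor are hypothetical stand-ins, and a
RuntimeException takes the place of RocksDBException. It shows one possible
shape in which each level releases its handle in a finally block, so a failed
marker write still propagates but no handle is skipped.

```java
// Hypothetical stand-in for a native handle such as a column family handle.
final class FakeHandle implements AutoCloseable {
    final String name;
    boolean closed;

    FakeHandle(String name) {
        this.name = name;
    }

    @Override
    public void close() {
        closed = true;
    }
}

// Hypothetical base accessor: writes a marker, then releases its handle.
class BaseAccessor {
    final FakeHandle offsetsHandle = new FakeHandle("offsets");
    private final boolean markerWriteFails;

    BaseAccessor(boolean markerWriteFails) {
        this.markerWriteFails = markerWriteFails;
    }

    // Simulates the marker write; fails on demand.
    void writeClosedMarker() {
        if (markerWriteFails) {
            throw new IllegalStateException("simulated marker write failure");
        }
    }

    void close() {
        try {
            writeClosedMarker();
        } finally {
            offsetsHandle.close(); // runs even if the write above throws
        }
    }
}

// Hypothetical subclass that owns one more handle.
class DataAccessor extends BaseAccessor {
    final FakeHandle dataHandle = new FakeHandle("data");

    DataAccessor(boolean markerWriteFails) {
        super(markerWriteFails);
    }

    @Override
    void close() {
        try {
            super.close();
        } finally {
            dataHandle.close(); // runs even if super.close() propagates
        }
    }
}

class CloseOrderDemo {
    public static void main(String[] args) {
        DataAccessor accessor = new DataAccessor(true);
        try {
            accessor.close();
        } catch (IllegalStateException e) {
            System.out.println("close failed: " + e.getMessage());
        }
        System.out.println("offsets handle closed: " + accessor.offsetsHandle.closed);
        System.out.println("data handle closed: " + accessor.dataHandle.closed);
    }
}
```

The exception still reaches the caller in this sketch, so a caller that
swallows it loses only the error report and not the handles.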
was:
Primary leak (KAFKA-20456 follow-up). RocksDBStore.createOffsetsCFOptions()
returns a new ColumnFamilyOptions() that is passed to a ColumnFamilyDescriptor
and then dropped; it is never assigned to a field and never closed. On the JNI
side, constructing a ColumnFamilyOptions auto-allocates a default
BlockBasedTableFactory with an 8 MB LRUCache. Native heap profiles from the
soak confirm this directly: the allocation path
Java_org_rocksdb_ColumnFamilyOptions_newColumnFamilyOptions →
BlockBasedTableFactory::InitializeOptions → LRUCacheOptions::MakeSharedCache
accounts for 5.5 GB (70%) on soak1 and 2.6 GB (54%) on soak2. The leak
compounds per segment, per task, so windowed/segmented stores amplify it
heavily.
Secondary leak (KIP-1035 close path). AbstractColumnFamilyAccessor.close()
writes a closedState marker to the offsets CF. If that write throws (which
happens during the EOSv2 cascade or an unclean shutdown, a case the existing
code comment already acknowledges), the subsequent
offsetColumnFamilyHandle.close() is skipped. SingleColumnFamilyAccessor.close()
and DualColumnFamilyAccessor.close() have the same ordering with no finally
block, so the data CF handles (and the oldCF/newCF handles for migrating
stores) also leak whenever super.close() propagates an exception.
RocksDBStore.close() swallows the resulting RocksDBException, so the leak is
silent.
> Close-path leaks in RocksDBStore cause native memory growth that eventually
> leads to OOM
> ----------------------------------------------------------------------------------------
>
> Key: KAFKA-20616
> URL: https://issues.apache.org/jira/browse/KAFKA-20616
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Affects Versions: 4.3.0, 4.4.0
> Reporter: Bill Bejeck
> Assignee: Bill Bejeck
> Priority: Blocker
> Fix For: 4.3.1, 4.4.0