[ 
https://issues.apache.org/jira/browse/KAFKA-19390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masahiro Mori reassigned KAFKA-19390:
-------------------------------------

    Assignee: Masahiro Mori

> AbstractIndex#resize() does not release old mmap on Linux
> ---------------------------------------------------------
>
>                 Key: KAFKA-19390
>                 URL: https://issues.apache.org/jira/browse/KAFKA-19390
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 3.8.1
>            Reporter: Masahiro Mori
>            Assignee: Masahiro Mori
>            Priority: Major
>              Labels: Linux
>
> Our Kafka broker crashed with the following error:
> {code:java}
> [2025-03-29 09:37:03,218] ERROR Error while appending records to <topic>-<partition> in dir /kafka-logs/data ...
> java.io.IOException: Map failed
> at java.base/sun.nio.ch.FileChannelImpl.mapInternal(FileChannelImpl.java:1127)
> at java.base/sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:1032)
> at org.apache.kafka.storage.internals.log.AbstractIndex.createMappedBuffer(AbstractIndex.java:467)
> at org.apache.kafka.storage.internals.log.AbstractIndex.createAndAssignMmap(AbstractIndex.java:105)
> at org.apache.kafka.storage.internals.log.AbstractIndex.<init>(AbstractIndex.java:83)
> at org.apache.kafka.storage.internals.log.TimeIndex.<init>(TimeIndex.java:65)
> at org.apache.kafka.storage.internals.log.LazyIndex.loadIndex(LazyIndex.java:242)
> at org.apache.kafka.storage.internals.log.LazyIndex.get(LazyIndex.java:179)
> at org.apache.kafka.storage.internals.log.LogSegment.timeIndex(LogSegment.java:146)
> at org.apache.kafka.storage.internals.log.LogSegment.readMaxTimestampAndOffsetSoFar(LogSegment.java:201)
> at org.apache.kafka.storage.internals.log.LogSegment.maxTimestampSoFar(LogSegment.java:211)
> at org.apache.kafka.storage.internals.log.LogSegment.append(LogSegment.java:262)
> at kafka.log.LocalLog.append(LocalLog.scala:417)
> ...
> Caused by: java.lang.OutOfMemoryError: Map failed
> at java.base/sun.nio.ch.FileChannelImpl.map0(Native Method)
> at java.base/sun.nio.ch.FileChannelImpl.mapInternal(FileChannelImpl.java:1124)
> ... 33 more{code}
> We found that the Kafka process had hit the vm.max_map_count limit (which was
> set to 262144) and that most of the mapped entries corresponded to deleted
> index files.
> {code:java}
> > sudo cat /proc/${KAFKA_PID}/maps | grep deleted
> 7d8c5cc00000-7d8c5d600000 rw-s 00000000 08:11 202854769 /kafka-logs/data/topic1-22/00000000332910579773.timeindex.deleted (deleted)
> 7d8c5d800000-7d8c5e200000 rw-s 00000000 08:11 202854768 /kafka-logs/data/topic1-22/00000000332910579773.index.deleted (deleted)
> 7d8c67400000-7d8c67e00000 rw-s 00000000 08:11 202562514 /kafka-logs/data/topic2-116/00000000165968090794.timeindex.deleted (deleted)
> 7d8c68000000-7d8c68a00000 rw-s 00000000 08:11 202562513 /kafka-logs/data/topic2-116/00000000165968090794.index.deleted (deleted)
> 7d8c6d400000-7d8c6de00000 rw-s 00000000 08:11 202596518 /kafka-logs/data/topic2-356/00000000168702579081.timeindex.deleted (deleted)
> 7d8c6e000000-7d8c6ea00000 rw-s 00000000 08:11 202596517 /kafka-logs/data/topic2-356/00000000168702579081.index.deleted (deleted)
> 7d8c71c00000-7d8c72600000 rw-s 00000000 08:11 202798981 /kafka-logs/data/topic3-433/00000000116740630582.timeindex.deleted (deleted)
> 7d8c72800000-7d8c73200000 rw-s 00000000 08:11 202798980 /kafka-logs/data/topic3-433/00000000116740630582.index.deleted (deleted)
> 7d8c77c00000-7d8c78600000 rw-s 00000000 08:11 202754947 /kafka-logs/data/topic3-74/00000000118067749684.timeindex.deleted (deleted)
> 7d8c78800000-7d8c79200000 rw-s 00000000 08:11 202754946 /kafka-logs/data/topic3-74/00000000118067749684.index.deleted (deleted)
> 7d8c79400000-7d8c79e00000 rw-s 00000000 08:11 202813710 /kafka-logs/data/topic2-82/00000000162756700035.timeindex.deleted (deleted)
> 7d8c7a000000-7d8c7aa00000 rw-s 00000000 08:11 202813709 /kafka-logs/data/topic2-82/00000000162756700035.index.deleted (deleted)
> 7d8c7ac00000-7d8c7b600000 rw-s 00000000 08:11 202596526 /kafka-logs/data/topic2-355/00000000169939763750.timeindex.deleted (deleted)
> 7d8c7b800000-7d8c7c200000 rw-s 00000000 08:11 202596525 /kafka-logs/data/topic2-355/00000000169939763750.index.deleted (deleted)
> 7d8c7c400000-7d8c7ce00000 rw-s 00000000 08:11 202562498 /kafka-logs/data/topic2-295/00000000168913981903.timeindex.deleted (deleted)
> 7d8c7d000000-7d8c7da00000 rw-s 00000000 08:11 202562497 /kafka-logs/data/topic2-295/00000000168913981903.index.deleted (deleted)
> 7d8c80c00000-7d8c81600000 rw-s 00000000 08:11 202754939 /kafka-logs/data/topic3-13/00000000115588098896.timeindex.deleted (deleted)
> 7d8c81800000-7d8c82200000 rw-s 00000000 08:11 202754938 /kafka-logs/data/topic3-13/00000000115588098896.index.deleted (deleted)
> 7d8c83c00000-7d8c84600000 rw-s 00000000 08:11 202798989 /kafka-logs/data/topic3-314/00000000118254254601.timeindex.deleted (deleted)
> 7d8c84800000-7d8c85200000 rw-s 00000000 08:11 202798988 /kafka-logs/data/topic3-314/00000000118254254601.index.deleted (deleted)
> ...{code}
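
The check above (how many mappings point at deleted files, relative to vm.max_map_count) could be automated roughly as follows. This is only a sketch: the class name MapCountSketch and the method countDeleted are hypothetical, and it assumes a Linux host where /proc/<pid>/maps and /proc/sys/vm/max_map_count are readable by the current user.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper, not part of Kafka.
class MapCountSketch {

    // The kernel appends " (deleted)" to a mapping whose backing file
    // has been unlinked while the mapping is still alive.
    static long countDeleted(List<String> mapsLines) {
        return mapsLines.stream()
                .filter(line -> line.endsWith(" (deleted)"))
                .count();
    }

    public static void main(String[] args) throws IOException {
        // Pass the broker PID as the first argument; defaults to this JVM.
        String pid = args.length > 0 ? args[0] : "self";
        Path maps = Path.of("/proc", pid, "maps");
        Path limitFile = Path.of("/proc/sys/vm/max_map_count");
        if (!Files.isReadable(maps) || !Files.isReadable(limitFile)) {
            System.out.println("procfs entries not readable (not Linux, or no permission)");
            return;
        }
        List<String> lines = Files.readAllLines(maps);
        long limit = Long.parseLong(Files.readString(limitFile).trim());
        System.out.printf("mappings=%d deleted=%d limit=%d%n",
                lines.size(), countDeleted(lines), limit);
    }
}
```

A deleted count that keeps growing toward the limit while the broker runs would be consistent with the leak described here.
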
> In AbstractIndex.resize(), the old memory mapping is explicitly unmapped on 
> Windows or z/OS using safeForceUnmap(), but on Linux the unmapping step is 
> skipped.
> The same issue was originally reported in KAFKA-7442, but the corresponding 
> pull request was never merged.
> We propose that resize() should call safeForceUnmap() on all platforms to 
> prevent stale mappings from lingering.
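
To illustrate the proposal, here is a minimal, self-contained sketch of an unconditional unmap-then-remap resize. It is not Kafka's code: ResizeSketch, onTempFile(), and forceUnmap() are hypothetical names, and forceUnmap() is only a stand-in for safeForceUnmap(). It assumes JDK 9 or later, where sun.misc.Unsafe#invokeCleaner is reachable via reflection (the method is deprecated in recent JDKs).

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for an mmap-backed index; not Kafka's AbstractIndex.
class ResizeSketch {
    private final Path file;
    private MappedByteBuffer mmap;

    private ResizeSketch(Path file, int size) {
        this.file = file;
        this.mmap = map(file, size);
    }

    static ResizeSketch onTempFile(int size) {
        try {
            Path p = Files.createTempFile("sketch", ".index");
            p.toFile().deleteOnExit();
            return new ResizeSketch(p, size);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    MappedByteBuffer buffer() {
        return mmap;
    }

    // Sets the file length and maps the whole file read-write.
    // The mapping stays valid after the channel is closed.
    private static MappedByteBuffer map(Path file, int size) {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            raf.setLength(size);
            return raf.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, size);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Releases the mapping immediately instead of waiting for GC.
    // Assumption: sun.misc.Unsafe#invokeCleaner is available (JDK 9+).
    private static void forceUnmap(ByteBuffer buffer) {
        try {
            Class<?> unsafeClass = Class.forName("sun.misc.Unsafe");
            Field field = unsafeClass.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            Object unsafe = field.get(null);
            Method invokeCleaner = unsafeClass.getMethod("invokeCleaner", ByteBuffer.class);
            invokeCleaner.invoke(unsafe, buffer);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("forced unmap not available", e);
        }
    }

    // Unmaps the old buffer on every platform before remapping.
    void resize(int newSize) {
        int position = mmap.position();
        mmap.force();                 // flush dirty pages to the file
        MappedByteBuffer old = mmap;
        mmap = null;                  // drop the reference before unmapping
        forceUnmap(old);
        mmap = map(file, newSize);
        mmap.position(Math.min(position, newSize));
    }

    public static void main(String[] args) {
        ResizeSketch index = onTempFile(4096);
        index.buffer().putLong(42L);
        index.resize(8192);
        System.out.println("capacity after resize: " + index.buffer().capacity());
    }
}
```

Touching a buffer after it has been forcibly unmapped can crash the JVM, so any real change along these lines would also need to ensure that no reader still holds the old buffer when resize() runs.
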



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
