Hasil Sharma created KAFKA-19014:
------------------------------------

             Summary: Potential race condition in remote-log-reader and 
remote-log-index-cleaner thread
                 Key: KAFKA-19014
                 URL: https://issues.apache.org/jira/browse/KAFKA-19014
             Project: Kafka
          Issue Type: Bug
    Affects Versions: 3.8.1
            Reporter: Hasil Sharma


Chain of events:

*Thread - 1 remote-log-reader*

1/ Fetches the offsetIndex from the indexCache, which internally maps the 
physical offset index file as a MappedByteBuffer.

OffsetIndex offsetIndex = indexCache.getIndexEntry(segmentMetadata).offsetIndex();
([here|https://github.com/apache/kafka/blob/cf7029c0264fd7f7b15c2e98acc874ec8c3403f2/core/src/main/java/kafka/log/remote/RemoteLogManager.java#L1772])

*Thread - 2 index cache thread*

The entry is marked for cleanup, i.e. the physical offset index file is renamed.

*Thread - 3 remote-log-index-cleaner*

The physical offset index file is deleted.

*Thread - 1 remote-log-reader*

Attempts to run a binary search on the MappedByteBuffer, which is now mapped 
to a non-existent file.

long upperBoundOffset = offsetIndex.fetchUpperBoundOffset(startOffsetPosition, fetchSize).map(position -> position.offset).orElse(segmentMetadata.endOffset() + 1);
([here|https://github.com/apache/kafka/blob/3.8/core/src/main/java/kafka/log/remote/RemoteLogManager.java#L1619])

 

This results in a fatal JVM error (SIGSEGV) with the following stack trace:

 
{code:java}
Stack: [0x000072ee9112d000,0x000072ee9122d000],  sp=0x000072ee9122b360,  free space=1016k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
J 6483 c2 java.nio.DirectByteBuffer.getInt(I)I java.base@17.0.14 (28 bytes) @ 0x000072f23d2f12f1 [0x000072f23d2f12a0+0x0000000000000051]
j  org.apache.kafka.storage.internals.log.OffsetIndex.relativeOffset(Ljava/nio/ByteBuffer;I)I+5
j  org.apache.kafka.storage.internals.log.OffsetIndex.parseEntry(Ljava/nio/ByteBuffer;I)Lorg/apache/kafka/storage/internals/log/OffsetPosition;+11
j  org.apache.kafka.storage.internals.log.OffsetIndex.parseEntry(Ljava/nio/ByteBuffer;I)Lorg/apache/kafka/storage/internals/log/IndexEntry;+3
j  org.apache.kafka.storage.internals.log.AbstractIndex.binarySearch(Ljava/nio/ByteBuffer;JLorg/apache/kafka/storage/internals/log/IndexSearchType;Lorg/apache/kafka/storage/internals/log/AbstractIndex$SearchResultType;II)I+30
j  org.apache.kafka.storage.internals.log.AbstractIndex.indexSlotRangeFor(Ljava/nio/ByteBuffer;JLorg/apache/kafka/storage/internals/log/IndexSearchType;Lorg/apache/kafka/storage/internals/log/AbstractIndex$SearchResultType;)I+126
j  org.apache.kafka.storage.internals.log.AbstractIndex.smallestUpperBoundSlotFor(Ljava/nio/ByteBuffer;JLorg/apache/kafka/storage/internals/log/IndexSearchType;)I+8
 {code}
 

 

As per the MappedByteBuffer documentation 
([here|https://devdocs.io/openjdk~17/java.base/java/nio/mappedbytebuffer]):

All or part of a mapped byte buffer may become inaccessible at any time, for 
example if the mapped file is truncated. An attempt to access an inaccessible 
region of a mapped byte buffer will not change the buffer's content and will 
cause an unspecified exception to be thrown either at the time of the access or 
at some later time. It is therefore strongly recommended that appropriate 
precautions be taken to avoid the manipulation of a mapped file by this 
program, or by a concurrently running program, except to read or write the 
file's content.
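
To illustrate the kind of precaution the documentation asks for, here is a minimal sketch of one possible guard. It is not Kafka code and not a proposed patch: GuardedIndex, lookupUpperBound and close are hypothetical names, and a plain int array stands in for the MappedByteBuffer. The idea is that the reader holds a read lock for the whole binary search, while the cleaner must take the write lock before releasing the backing resource, so a late reader gets a catchable exception instead of touching released memory.

```java
import java.util.OptionalInt;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for an index whose backing resource can be released. */
final class GuardedIndex {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private int[] entries;           // stands in for the MappedByteBuffer
    private boolean closed = false;

    GuardedIndex(int[] sortedEntries) {
        this.entries = sortedEntries.clone();
    }

    /**
     * Reader path (remote-log-reader in the report): returns the smallest
     * entry >= target, or empty if there is none.
     * Throws IllegalStateException if the index was already closed.
     */
    OptionalInt lookupUpperBound(int target) {
        lock.readLock().lock();
        try {
            if (closed) {
                throw new IllegalStateException("index already closed");
            }
            // Binary search runs entirely under the read lock.
            int lo = 0;
            int hi = entries.length;
            while (lo < hi) {
                int mid = (lo + hi) >>> 1;
                if (entries[mid] < target) {
                    lo = mid + 1;
                } else {
                    hi = mid;
                }
            }
            return lo < entries.length ? OptionalInt.of(entries[lo]) : OptionalInt.empty();
        } finally {
            lock.readLock().unlock();
        }
    }

    /**
     * Cleaner path (remote-log-index-cleaner in the report): blocks until
     * in-flight readers have finished, then releases the resource.
     */
    void close() {
        lock.writeLock().lock();
        try {
            closed = true;
            entries = null;          // stands in for unmapping / deleting the file
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

With such a guard, a caller that hits the IllegalStateException could re-fetch the entry from the cache and retry. Whether that fits the real indexCache life cycle is an open question for the actual fix.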



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
