[
https://issues.apache.org/jira/browse/KAFKA-15609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17782229#comment-17782229
]
Alexandre Dupriez commented on KAFKA-15609:
-------------------------------------------
The nature of a memory mapping, private or shared, has visibility
implications between processes, but within the same process consistency
should always be guaranteed.
"Flushing" a memory-mapped file to the block device can be initiated with the
{{msync}} syscall, but that operation is not necessary for the visibility
guarantees questioned in this ticket.
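
To illustrate the point, here is a minimal Java sketch, not Kafka code. The
class name, method name and temporary file are hypothetical. It assumes Linux,
where a READ_WRITE mapping created by {{FileChannel.map}} is a shared mapping
backed by the page cache. It writes through a {{MappedByteBuffer}}, never calls
{{force()}}, and reads the same bytes back through the file channel within the
same process.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MmapVisibility {

    // Hypothetical helper: returns true if a value written through the mapping
    // can be read back through the file channel without calling force().
    static boolean writeVisibleWithoutForce() throws IOException {
        Path file = Files.createTempFile("mmap-visibility", ".index");
        file.toFile().deleteOnExit();
        long expected = 0xCAFEBABEL;

        try (FileChannel channel = FileChannel.open(
                file, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Mapping in READ_WRITE mode grows the empty file to 8 bytes.
            MappedByteBuffer mmap = channel.map(FileChannel.MapMode.READ_WRITE, 0, 8);

            // Write through the mapping only; force() (msync) is deliberately not called.
            mmap.putLong(0, expected);

            // Read the same region back through regular file I/O.
            ByteBuffer readBack = ByteBuffer.allocate(8);
            long position = 0;
            while (readBack.hasRemaining()) {
                int n = channel.read(readBack, position);
                if (n < 0) {
                    break; // unexpected end of file
                }
                position += n;
            }
            return !readBack.hasRemaining() && readBack.getLong(0) == expected;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("visible without force(): " + writeVisibleWithoutForce());
    }
}
```

The sketch only exercises visibility within one process. It says nothing about
durability on the block device, which is what {{force()}} / {{msync}} address.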
A succinct description of memory mapping can be found in {_}Understanding
the Linux Kernel, Third Edition{_}, O'Reilly, pages 657-668.
> Corrupted index uploaded to remote tier
> ---------------------------------------
>
> Key: KAFKA-15609
> URL: https://issues.apache.org/jira/browse/KAFKA-15609
> Project: Kafka
> Issue Type: Bug
> Components: Tiered-Storage
> Affects Versions: 3.6.0
> Reporter: Divij Vaidya
> Priority: Minor
>
> While testing Tiered Storage, we have observed corrupt indexes being present
> in remote tier. One such situation is covered here at
> https://issues.apache.org/jira/browse/KAFKA-15401. This Jira presents another
> such possible case of corruption.
> Potential cause of index corruption:
> We want to ensure that the file we are passing to RSM plugin contains all the
> data which is present in MemoryByteBuffer i.e. we should have flushed the
> MemoryByteBuffer to the file using force(). In Kafka, when we close a
> segment, indexes are flushed asynchronously [1]. Hence, it might be possible
> that when we are passing the file to RSM, the file doesn't contain flushed
> data. Hence, we may end up uploading indexes which haven't been flushed yet.
> Ideally, the contract should enforce that we force flush the content of
> MemoryByteBuffer before we give the file for RSM. This will ensure that
> indexes are not corrupted/incomplete.
> [1]
> [https://github.com/apache/kafka/blob/4150595b0a2e0f45f2827cebc60bcb6f6558745d/core/src/main/scala/kafka/log/UnifiedLog.scala#L1613]
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)