[ 
https://issues.apache.org/jira/browse/KAFKA-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao resolved KAFKA-4545.
----------------------------
    Fix Version/s: 3.1.0
       Resolution: Fixed

This is fixed in KAFKA-8522.

> tombstone needs to be removed after delete.retention.ms has passed after it 
> has been cleaned
> --------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-4545
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4545
>             Project: Kafka
>          Issue Type: Improvement
>          Components: log
>    Affects Versions: 0.10.0.0, 0.11.0.0, 1.0.0
>            Reporter: Jun Rao
>            Assignee: Richard Yu
>            Priority: Minor
>              Labels: needs-kip
>             Fix For: 3.1.0
>
>
> The algorithm for removing the tombstone in a compacted log is supposed to be the 
> following.
> 1. Tombstone is never removed when it's still in the dirty portion of the log.
> 2. After the tombstone is in the cleaned portion of the log, we further delay 
> the removal of the tombstone by delete.retention.ms, measured from the time the 
> tombstone entered the cleaned portion.
> Once the tombstone is in the cleaned portion, we know there can't be any 
> message with the same key before the tombstone. Therefore, for any consumer, 
> if it reads a non-tombstone message before the tombstone, but can read to the 
> end of the log within delete.retention.ms, it's guaranteed to see the 
> tombstone.
> However, the current implementation doesn't seem correct. We delay the 
> removal of the tombstone by delete.retention.ms since the last modified time 
> of the last cleaned segment. However, the last modified time is inherited 
> from the original segment, which could be arbitrarily old. So, the tombstone 
> may not be preserved as long as it needs to be.
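
To make the difference between the intended rule and the reported behaviour concrete, here is a minimal Java sketch. It is not the Kafka LogCleaner code: the class TombstoneRetention, both method names, and all parameters are hypothetical and exist only for illustration. The sketch assumes the time at which a tombstone entered the cleaned portion is available as a millisecond timestamp, and it treats "delete.retention.ms has passed" as "at least that many milliseconds have elapsed".

```java
/** Hypothetical illustration only; none of these names come from Kafka. */
final class TombstoneRetention {
    private TombstoneRetention() {}

    /**
     * Intended rule from the description: a tombstone in the dirty portion
     * is never removed, and one in the cleaned portion is removed only after
     * delete.retention.ms has elapsed since it entered the cleaned portion.
     */
    static boolean canRemove(boolean inCleanedPortion, long cleanedAtMs,
                             long nowMs, long deleteRetentionMs) {
        if (!inCleanedPortion) {
            return false; // rule 1: still dirty, always keep
        }
        return nowMs - cleanedAtMs >= deleteRetentionMs; // rule 2
    }

    /**
     * Behaviour the description reports as incorrect: the delay is measured
     * from the segment's last modified time, which is inherited from the
     * original segment and may be arbitrarily old.
     */
    static boolean canRemoveUsingSegmentMtime(long segmentLastModifiedMs,
                                              long nowMs, long deleteRetentionMs) {
        return nowMs - segmentLastModifiedMs >= deleteRetentionMs;
    }
}
```

With a retention of one day, a tombstone that entered the cleaned portion one hour ago is kept by canRemove, whereas canRemoveUsingSegmentMtime reports it as removable whenever the inherited last modified time is more than a day old, which is the premature removal described above.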



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
