[ https://issues.apache.org/jira/browse/KAFKA-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jun Rao resolved KAFKA-4545.
----------------------------
    Fix Version/s: 3.1.0
       Resolution: Fixed

This is fixed in KAFKA-8522.

> tombstone needs to be removed after delete.retention.ms has passed after it
> has been cleaned
> --------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-4545
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4545
>             Project: Kafka
>          Issue Type: Improvement
>          Components: log
>    Affects Versions: 0.10.0.0, 0.11.0.0, 1.0.0
>            Reporter: Jun Rao
>            Assignee: Richard Yu
>            Priority: Minor
>              Labels: needs-kip
>             Fix For: 3.1.0
>
>
> The algorithm for removing a tombstone in a compacted log is supposed to be
> the following.
> 1. A tombstone is never removed while it's still in the dirty portion of the
> log.
> 2. After the tombstone is in the cleaned portion of the log, we further delay
> the removal of the tombstone by delete.retention.ms, measured from the time
> the tombstone entered the cleaned portion.
> Once the tombstone is in the cleaned portion, we know there can't be any
> message with the same key before the tombstone. Therefore, for any consumer,
> if it reads a non-tombstone message before the tombstone, but can read to the
> end of the log within delete.retention.ms, it's guaranteed to see the
> tombstone.
> However, the current implementation doesn't seem correct. We delay the
> removal of the tombstone by delete.retention.ms, measured from the last
> modified time of the last cleaned segment. However, the last modified time is
> inherited from the original segment, which could be arbitrarily old. So, the
> tombstone may not be preserved as long as it needs to be.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
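
For illustration only, the two rules in the issue description and the reported discrepancy can be sketched as below. This is a simplified model, not Kafka's log cleaner code: the `Tombstone` class, the `cleaned_at_ms` field, both function names, and the `>=` boundary check are hypothetical and exist only to make the timing argument concrete.

```python
from dataclasses import dataclass
from typing import Optional

DAY_MS = 24 * 60 * 60 * 1000


@dataclass
class Tombstone:
    """Hypothetical model of a tombstone record in a compacted log."""
    key: str
    # Time (ms) at which the tombstone entered the cleaned portion of the log;
    # None while it is still in the dirty portion.
    cleaned_at_ms: Optional[int] = None


def can_remove(tombstone: Tombstone, now_ms: int, delete_retention_ms: int) -> bool:
    """Intended rule, as described in the issue."""
    # Rule 1: a tombstone in the dirty portion is never removed.
    if tombstone.cleaned_at_ms is None:
        return False
    # Rule 2: once cleaned, keep it for delete.retention.ms counted from the
    # moment it entered the cleaned portion.
    return now_ms - tombstone.cleaned_at_ms >= delete_retention_ms


def can_remove_by_segment_mtime(segment_last_modified_ms: int, now_ms: int,
                                delete_retention_ms: int) -> bool:
    """Reported behaviour (simplified): the delay is counted from the cleaned
    segment's last-modified time, which is inherited from the original segment."""
    return now_ms - segment_last_modified_ms >= delete_retention_ms


if __name__ == "__main__":
    now = 10 * DAY_MS
    just_cleaned = Tombstone(key="k", cleaned_at_ms=now)
    # Intended rule: the tombstone was cleaned just now, so it must be kept.
    print(can_remove(just_cleaned, now, DAY_MS))
    # Reported behaviour: the segment's inherited mtime is a week old, so the
    # same tombstone already looks eligible for removal.
    print(can_remove_by_segment_mtime(now - 7 * DAY_MS, now, DAY_MS))
```

In this model, a tombstone that has only just entered the cleaned portion is kept under the intended rule, while the mtime-based check treats it as removable because the inherited timestamp is older than delete.retention.ms.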