xiongqi wu created KAFKA-7321:
---------------------------------

             Summary: ensure timely processing of deletion requests in Kafka 
topic (Time-based log compaction)
                 Key: KAFKA-7321
                 URL: https://issues.apache.org/jira/browse/KAFKA-7321
             Project: Kafka
          Issue Type: Improvement
          Components: log
            Reporter: xiongqi wu


_Compaction enables Kafka to remove old messages that are flagged for deletion 
while other messages can be retained for a relatively longer time.  Today, a 
log segment may remain un-compacted for a long time since the eligibility for 
log compaction is determined based on compaction ratio 
(“min.cleanable.dirty.ratio”) and min compaction lag ("min.compaction.lag.ms") 
setting.  Ability to delete a log message through compaction in a timely manner 
has become an important requirement in some use cases (e.g., GDPR).  For 
example,  one use case is to delete PII (Personal Identifiable information) 
data within 7 days while keeping non-PII indefinitely in compacted format.  The 
goal of this change is to provide a time-based compaction policy that ensures 
the cleanable section is compacted after the specified time interval regardless 
of dirty ratio and “min compaction lag”.  However, dirty ratio and “min 
compaction lag” are still honored if the time based compaction rule is not 
violated. In other words, if Kafka receives a deletion request on a key (e..g, 
a key with null value), the corresponding log segment will be picked up for 
compaction after the configured time interval to remove the key._

 

_This is to track effort in KIP 354:_

_https://cwiki.apache.org/confluence/display/KAFKA/KIP-354%3A+Time-based+log+compaction+policy_



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to