[ https://issues.apache.org/jira/browse/KAFKA-7061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16613312#comment-16613312 ]

Leon Widdershoven commented on KAFKA-7061:
------------------------------------------

I also feel this is a must-have, especially in an environment where events with 
the same key are processed by multiple nodes at almost the same time. The 
events do carry timestamps, so they are ordered and can be deconflicted while 
they are still in the non-compacted part of the log, but the compaction policy 
does not take those timestamps into account. Compacting on the timestamp would, 
in my opinion, make the compaction policy viable in scenarios where the same 
entities are updated frequently (e.g. within milliseconds).
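To illustrate the difference, here is a minimal sketch (in Python, purely for illustration, not Kafka's actual cleaner code) of offset-based versus timestamp-based compaction; the `compact` function and the `strategy` parameter are hypothetical names:

```python
# Hypothetical sketch: which record per key survives compaction under the
# default "offset" strategy versus the proposed "timestamp" strategy.

def compact(records, strategy="offset"):
    """records: list of (offset, key, timestamp, value) in log order."""
    kept = {}
    for offset, key, ts, value in records:
        if key not in kept:
            kept[key] = (offset, key, ts, value)
        elif strategy == "timestamp":
            # Newer timestamp wins; on a tie, the later offset wins.
            if ts >= kept[key][2]:
                kept[key] = (offset, key, ts, value)
        else:
            # Default behavior: the later offset always wins.
            kept[key] = (offset, key, ts, value)
    return sorted(kept.values())

# Two writers update key "a" almost simultaneously; the older update
# happens to land at the higher offset.
log = [(0, "a", 100, "v1"), (1, "a", 99, "stale")]
print(compact(log, "offset"))     # keeps the stale record (higher offset)
print(compact(log, "timestamp"))  # keeps v1 (newer timestamp)
```

With offset-based compaction the stale record survives; compacting on the timestamp keeps the logically newest update regardless of arrival order.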

> Enhanced log compaction
> -----------------------
>
>                 Key: KAFKA-7061
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7061
>             Project: Kafka
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 2.0.0
>            Reporter: Luis Cabral
>            Priority: Major
>             Fix For: 2.1.0
>
>
> Enhance log compaction to support more than just offset comparison, so that 
> insertion order no longer dictates which records to keep.
> The default behavior is kept as it was, with the enhanced approach having to 
> be purposely activated.
> The enhanced compaction is driven either by the record timestamp, by setting 
> the new configuration to "timestamp", or by the record headers, by setting 
> this configuration to anything other than the default "offset" or the 
> reserved "timestamp".
> See 
> [KIP-280|https://cwiki.apache.org/confluence/display/KAFKA/KIP-280%3A+Enhanced+log+compaction]
>  for more details.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
