[jira] [Commented] (KAFKA-7061) Enhanced log compaction

Dirk Pitt (Jira) Tue, 05 Jan 2021 03:55:07 -0800


    [ 
https://issues.apache.org/jira/browse/KAFKA-7061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17258854#comment-17258854
 ]


Dirk Pitt commented on KAFKA-7061:
----------------------------------

As I understand it, this change will add two additional compaction strategies. 
Can I ask why not use the standard way of configuring a class that implements the 
custom compaction strategy, like a custom partitioner...

a la
{code:java}
"partitioner.class", "com.some.company.PartitionStrategyImpl"
{code}
 

which would be much more flexible. Of course, the person who implements the strategy 
must know what he/she is doing.

My problem: I have a message key like this,

uuid1_othertoken1 -> message1v1

uuid1 actually identifies my message, but I have to use 'othertoken1' for my 
partitioning logic. After a while I can get message1v2:

uuid1_othertoken2 -> message1v2

Although message1v2 is the new version of message1v1, Kafka will not be able 
to compact the messages, because their keys differ.

With a custom compaction strategy I could tell Kafka to consider only uuid1 
for compaction.
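
To make the idea concrete, here is a rough sketch in plain Java of what I mean. Nothing in it is Kafka API: the class, the LogRecord type, and the uuidPrefix helper are all hypothetical names made up for illustration, and the compaction is simulated on an in-memory list. It only shows that deriving the compaction key from the uuid part would let message1v2 replace message1v1.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Illustration only: none of these types exist in Kafka.
class KeyPrefixCompactionSketch {

    // Simplified stand-in for a log record: key, value, and log offset.
    static final class LogRecord {
        final String key;
        final String value;
        final long offset;

        LogRecord(String key, String value, long offset) {
            this.key = key;
            this.value = value;
            this.offset = offset;
        }

        @Override
        public String toString() {
            return offset + ":" + key + " -> " + value;
        }
    }

    // Keeps, per compaction key, only the record with the highest offset.
    // The compaction key is derived from the record key by the given function.
    static List<LogRecord> compact(List<LogRecord> log,
                                   Function<String, String> compactionKey) {
        Map<String, LogRecord> latest = new LinkedHashMap<>();
        for (LogRecord r : log) {
            String k = compactionKey.apply(r.key);
            LogRecord current = latest.get(k);
            if (current == null || r.offset > current.offset) {
                latest.put(k, r);
            }
        }
        // Return the survivors in offset order, as they would appear in the log.
        List<LogRecord> survivors = new ArrayList<>(latest.values());
        survivors.sort((a, b) -> Long.compare(a.offset, b.offset));
        return survivors;
    }

    // Hypothetical key extractor: everything before the first '_',
    // or the whole key if there is no '_'.
    static String uuidPrefix(String key) {
        int i = key.indexOf('_');
        return i < 0 ? key : key.substring(0, i);
    }

    public static void main(String[] args) {
        List<LogRecord> log = List.of(
            new LogRecord("uuid1_othertoken1", "message1v1", 0L),
            new LogRecord("uuid1_othertoken2", "message1v2", 1L));

        // Compaction on the full key: the two keys differ, so both records stay.
        System.out.println(compact(log, Function.identity()));

        // Compaction on the uuid part only: the newer record replaces the older one.
        System.out.println(compact(log, KeyPrefixCompactionSketch::uuidPrefix));
    }
}
```

With the full key both records survive; with the uuid-only key just the message1v2 record is left.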

Is there any reason why this path was not followed?

 

 

> Enhanced log compaction
> -----------------------
>
>                 Key: KAFKA-7061
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7061
>             Project: Kafka
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 2.5.0
>            Reporter: Luis Cabral
>            Assignee: Senthilnathan Muthusamy
>            Priority: Major
>              Labels: kip
>
> Enhance log compaction to support more than just offset comparison, so the 
> insertion order isn't dictating which records to keep.
> Default behavior is kept as it was, with the enhanced approach having to be 
> purposely activated.
>  The enhanced compaction is done either via the record timestamp, by setting 
> the new configuration to "timestamp", or via the record headers, by setting 
> this configuration to anything other than the default "offset" or the 
> reserved "timestamp".
> See 
> [KIP-280|https://cwiki.apache.org/confluence/display/KAFKA/KIP-280%3A+Enhanced+log+compaction]
>  for more details.
> +From Guozhang:+ We should emphasize on the WIKI that the newly introduced 
> config yields to the existing "log.cleanup.policy", i.e. if the latter's 
> value is `delete` not `compact`, then the previous config would be ignored.
> +From Jun Rao:+ With the timestamp/header strategy, the behavior of the 
> application may need to change. In particular, the application can't just 
> blindly take the record with a larger offset and assume that it's the value 
> to keep. It needs to check the timestamp or the header now. So, it would be 
> useful to at least document this. 
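
The difference Jun Rao describes above can be sketched in plain Java. This is only a simulation on an in-memory list, not Kafka code: the class and record type are hypothetical, and breaking timestamp ties by offset is an assumption of this sketch rather than something stated in the description.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: none of these types exist in Kafka.
class TimestampCompactionSketch {

    // Simplified stand-in for a log record with an offset and a timestamp.
    static final class Rec {
        final String key;
        final String value;
        final long offset;
        final long timestamp;

        Rec(String key, String value, long offset, long timestamp) {
            this.key = key;
            this.value = value;
            this.offset = offset;
            this.timestamp = timestamp;
        }
    }

    // "offset" behavior: per key, the record with the highest offset wins.
    static Map<String, Rec> keepHighestOffset(List<Rec> log) {
        Map<String, Rec> kept = new LinkedHashMap<>();
        for (Rec r : log) {
            Rec cur = kept.get(r.key);
            if (cur == null || r.offset > cur.offset) {
                kept.put(r.key, r);
            }
        }
        return kept;
    }

    // "timestamp" behavior: per key, the record with the highest timestamp wins.
    // Assumption of this sketch: equal timestamps fall back to the higher offset.
    static Map<String, Rec> keepHighestTimestamp(List<Rec> log) {
        Map<String, Rec> kept = new LinkedHashMap<>();
        for (Rec r : log) {
            Rec cur = kept.get(r.key);
            boolean newer = cur == null
                || r.timestamp > cur.timestamp
                || (r.timestamp == cur.timestamp && r.offset > cur.offset);
            if (newer) {
                kept.put(r.key, r);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // The record appended last (offset 1) carries the older timestamp.
        List<Rec> log = List.of(
            new Rec("k", "fresh", 0L, 200L),
            new Rec("k", "stale", 1L, 100L));

        System.out.println(keepHighestOffset(log).get("k").value);
        System.out.println(keepHighestTimestamp(log).get("k").value);
    }
}
```

In this sketch the two strategies keep different records for the same key, which is why an application can no longer assume that the record with the larger offset is the one to keep.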



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
