Hi all,

Since there has been a rejuvenated interest in this KIP, it felt better to 
downgrade it back down to [DISCUSSION], as we aren't really voting on it 
anymore.
I'll try to address the currently pending questions on the following points, so 
please bear with me while we go through them all:

1) Configuration: Topic vs Global

Here we all agree that having a configuration per-topic is the best option. 
However, this is not possible with the current design of the compaction 
solution. Yes, it is true that "some" properties that relate to compaction are 
configurable per-topic, but those are only the properties that act outside(!) 
of the actual compaction logic, such as configuring the start-compaction 
trigger with "ratio" or compaction duration with "lag.ms".
This logic can, of course, be re-designed to suit our wishes, but this is not a 
direct effort, and if we have spent months arguing about the extra 8 bytes per 
record, for sure we would spend another few dozen months discussing the memory 
impact that doing this change to the properties will invariably have.
As such, I will limit this KIP to ONLY have these configurations globally.

2) Configuration: Fail-fast vs Fallback


Ideally, I would also like to prevent the application to start if the 
configuration is somehow invalid.
However (another 'however'), the way the configuration is handled prevents 
adding dependencies between them, so we can't add logic that says 
"configuration X is invalid if configuration Y is so-and-such".
Again, this can be re-designed to add this feature to the configuration logic, 
but it would again be a big change just by itself, so this KIP is again limited 
to use ONLY what is already in place.

3) Documenting the memory impact on the KIP

This is now added to the KIP, though this topic is more complicated than 
'memory impact'. E.g.: this change doesn't translate to an actual memory 
impact, it just means that the compaction will potentially encompass less 
records per execution.

4) Documenting how we deal with the last message in the log

I have 2 interpretations for this request: "the last message in the log" or 
"the last message with a shared key on the log"
For the former: there is no change to the logic on how the last message is 
handled. Only the "tail" gets compacted, so the "head" (which includes the last 
message) still keeps the last message

5) Documenting how the key deletion will be handled

I'm having some trouble understanding this one; do you mean how keys are 
deleted in general, or?

Cheers,
Luis Cabral

   On Friday, August 24, 2018, 1:54:54 AM GMT+2, Jun Rao <j...@confluent.io> 
wrote:  
 
 Hi, Luis,

Thanks for the reply. A few more comments below.

1. About the topic level configuration. It seems that it's useful for the
new configs to be at the topic level. Currently, the following configs
related to compaction are already at the topic level.

min.cleanable.dirty.ratio
min.compaction.lag.ms
cleanup.policy

2. Have you documented the memory impact in the KIP?

3. Could you document how we deal with the last message in the log, which
is potentially cleanable now?

4. Could you document how key deletion will be handled?

10. As for Jason's proposal on CompactionStrategy, it does make the feature
more general. On the other hand, it will be useful not to require
user-level code if the compaction value only comes from the header.

20. "If compaction.strategy.header is chosen and compaction.strategy.header
is not set, the KIP falls back to offset." I am wondering if it's better to
just fail the configuration in the case.

Jun



On Thu, Aug 16, 2018 at 1:33 PM, Guozhang Wang <wangg...@gmail.com> wrote:

> Regarding "broker-agnostic of headers": there are some KIPs from Streams to
> use headers for internal purposes as well, e.g. KIP-258 and KIP-213 (I
> admit there may be a conflict with user space, but practically I think it
> is very rare). So I think we are very likely going to make Kafka internals
> to be "headers-aware" anyways.
>
> Regarding the general API: I think it is a good idea in general, but it may
> still have limits: note that right now our KIP enforce a header type to be
> long, and we have a very careful discussion about the fall-back policy if
> header does not have the specified key or if the value is not long-typed;
> but if we enforce long type version in the interface, it would require
> users trying to customizing their compaction logic (think: based on some
> value payload field) to transform their fields to long as well. So I feel
> we can still go with the current proposed approach, and only consider this
> general API if we observe it does have a general usage requirement. By that
> time we can still extend the config values of "log.cleaner.compaction.
> strategy" to "offset, timestamp, header, myFuncName".
>
> @Bertus
>
> Thanks for your feedback, I believe the proposed config is indeed for both
> global (for the whole broker) and per-topic, Luís can confirm if this is
> the case, and update the wiki page to make it clear.
>
>
> Guozhang
>
>
> On Thu, Aug 16, 2018 at 9:09 AM, Bertus Greeff <
> bgre...@microsoft.com.invalid> wrote:
>
> > I'm interested to know the status of this KIP.  I see that the status is
> > "Voting".  How long does this normally take?
> >
> > We want to use Kafka and this KIP provides exactly the log compaction
> > logic that we want for many of our projects.
> >
> > One piece of feedback that I have is that log.cleaner.compaction.
> strategy
> > and log.cleaner.compaction.strategy.header needs to be per topic.  The
> > text of the KIP makes it sound that the config is only available for all
> > topics but this makes no sense.  Different topics will need different
> > strategies and/or headers.
> >
> > From the KIP:
> > Provide the configuration for the individual topics
> > None of the configurations for log compaction are available at topic
> > level, so adding it there is not a part of this KIP
> >
> >
> >
> > On 2018/04/05 08:44:00, Luís Cabral <l...@yahoo.com.INVALID> wrote:
> > > Hello all,>
> > > Starting a discussion for this feature.>
> > > KIP-280  :  https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 280%
> > 3A+Enhanced+log+compactionPull-4822 :  https://github.com/apache/kafk
> > a/pull/4822>
> >
> > > Kind Regards,Luís>
> >
>
>
>
> --
> -- Guozhang
>  

Reply via email to