Hi Dong,

Thank you for your comment.  See my inline comments.
I will update the KIP shortly.

Xiongqi (Wesley) Wu


On Sun, Oct 28, 2018 at 9:17 PM Dong Lin <lindon...@gmail.com> wrote:

> Hey Xiongqi,
>
> Sorry for the late reply. I have some comments below:
>
> 1) As discussed earlier on the mailing list, if a topic is configured with
> both deletion and compaction, in some cases messages produced a long time
> ago cannot be deleted based on time. This is a valid use case because we
> actually have a topic that is configured with both deletion and compaction
> policies, and we should enforce the semantics of both policies. Solution A
> sounds good. We do not need an interface change (e.g. an extra config) to
> enforce solution A. All we need is to update the implementation so that
> when the broker compacts a topic, if the message has a timestamp (which is
> the common case), messages that are too old (based on the time-based
> retention config) will be discarded. Since this is a valid issue and it is
> also related to the guarantee of when a message can be deleted, can we
> include the solution to this problem in the KIP?
>
======>  This makes sense.  We can use a similar approach to increase the
log start offset.
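
To make the idea concrete, here is a rough sketch of dropping expired
records while compacting. It is not the actual log cleaner code: the
TimestampedRecord and RetentionAwareCompaction names are made up for
illustration, and treating a negative retention value as "no time-based
retention" is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical record type for illustration; not Kafka's own record class.
final class TimestampedRecord {
    final String key;
    final long timestampMs;

    TimestampedRecord(String key, long timestampMs) {
        this.key = key;
        this.timestampMs = timestampMs;
    }
}

final class RetentionAwareCompaction {
    // Returns the records that are still within the retention window.
    // Assumption: a negative retentionMs means time-based retention is off.
    static List<TimestampedRecord> dropExpired(List<TimestampedRecord> records,
                                               long retentionMs,
                                               long nowMs) {
        List<TimestampedRecord> retained = new ArrayList<>();
        for (TimestampedRecord record : records) {
            boolean expired = retentionMs >= 0
                    && nowMs - record.timestampMs > retentionMs;
            if (!expired) {
                retained.add(record);
            }
        }
        return retained;
    }
}
```

In a real cleaner this check would sit next to the existing per-key
de-duplication, so a record is kept only if it is both the latest value for
its key and still inside the retention window.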

>
> 2) It is probably OK to assume that all messages have timestamp. The
> per-message timestamp was introduced into Kafka 0.10.0 with KIP-31 and
> KIP-32 as of Feb 2016. Kafka 0.10.0 or earlier versions are no longer
> supported. Also, since the use-case for this feature is primarily for GDPR,
> we can assume that client library has already been upgraded to support SSL,
> which feature is added after KIP-31 and KIP-32.
>
=========>  OK. We can use the message timestamp to delete expired records
if both compaction and retention are enabled.


> 3) In Proposed Change section 2.a, it is said that segment.largestTimestamp
> - maxSegmentMs can be used to determine the timestamp of the earliest
> message. Would it be simpler to just use the create time of the file to
> determine the time?
>
========>  Linux/Java doesn't provide a dependable API for file creation
time because some filesystem types don't record a file creation time.
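
For reference, a small sketch of what Java NIO exposes. BasicFileAttributes
does have a creationTime() accessor, but its documentation says that when
the filesystem does not store a creation timestamp the method returns an
implementation-specific default, typically the last-modified time or the
epoch. So the value cannot be trusted as the segment creation time. The
FileCreationTimeProbe class name is made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

final class FileCreationTimeProbe {
    // Reads the creationTime attribute. On filesystems without creation
    // time support this may be the last-modified time or the epoch.
    static long creationTimeMs(Path path) throws IOException {
        BasicFileAttributes attrs =
                Files.readAttributes(path, BasicFileAttributes.class);
        return attrs.creationTime().toMillis();
    }

    // Last-modified time is available on all common filesystems.
    static long lastModifiedMs(Path path) throws IOException {
        return Files.getLastModifiedTime(path).toMillis();
    }
}
```

Comparing the two values on a given broker volume is a quick way to see
whether that filesystem reports a real creation time at all.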


> 4) The KIP suggests using must-clean-ratio to select the partition to be
> compacted. Unlike dirty ratio, which is mostly for performance, the logs
> whose "must-clean-ratio" is non-zero must be compacted immediately for
> correctness reasons (and for GDPR). And if this cannot be achieved because
> e.g. broker compaction throughput is too low, investigation will be needed.
> So it seems simpler to first compact logs which have a segment whose
> earliest timestamp is earlier than now - max.compaction.lag.ms, instead of
> defining must-clean-ratio and sorting logs based on this value.
>
>
======>  Good suggestion. This can simplify the implementation quite a bit
if we are not too concerned about compaction of a GDPR-required partition
being queued behind some large partition.  The actual compaction completion
time is not guaranteed anyway.
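
A rough sketch of that selection rule, under stated assumptions: SegmentInfo
and MaxLagSelector are hypothetical names, -1 is assumed to mark a message
without a timestamp, and the fallback estimate "largestTimestamp -
maxSegmentMs" is the one described in the KIP text quoted further down this
thread. It is not the cleaner's real code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical view of a log's first un-compacted segment.
final class SegmentInfo {
    final String topicPartition;
    final long firstMessageTimestampMs; // assumption: -1 means "no timestamp"
    final long largestTimestampMs;

    SegmentInfo(String topicPartition, long firstMessageTimestampMs,
                long largestTimestampMs) {
        this.topicPartition = topicPartition;
        this.firstMessageTimestampMs = firstMessageTimestampMs;
        this.largestTimestampMs = largestTimestampMs;
    }
}

final class MaxLagSelector {
    // Uses the first message timestamp if present, otherwise falls back
    // to the conservative estimate largestTimestamp - maxSegmentMs.
    static long estimateEarliestTimestampMs(SegmentInfo s, long maxSegmentMs) {
        if (s.firstMessageTimestampMs >= 0) {
            return s.firstMessageTimestampMs;
        }
        return s.largestTimestampMs - maxSegmentMs;
    }

    // Returns the partitions whose earliest un-compacted data is older
    // than now - maxCompactionLagMs; these would be compacted first.
    static List<String> mustCompactNow(List<SegmentInfo> segments,
                                       long maxCompactionLagMs,
                                       long maxSegmentMs,
                                       long nowMs) {
        List<String> due = new ArrayList<>();
        for (SegmentInfo s : segments) {
            long earliest = estimateEarliestTimestampMs(s, maxSegmentMs);
            if (earliest < nowMs - maxCompactionLagMs) {
                due.add(s.topicPartition);
            }
        }
        return due;
    }
}
```

Logs returned by mustCompactNow would go ahead of the usual dirty-ratio
ordering; everything else keeps the existing selection logic.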


> 5) The KIP says max.compaction.lag.ms is 0 by default and it is also
> suggested that 0 means disable. Should we set this value to MAX_LONG by
> default to effectively disable the feature added in this KIP?
>
====> I would rather use 0 so the corresponding code path will not be
exercised.  By using MAX_LONG, we would theoretically go through the related
code to find out whether the partition is required to be compacted to
satisfy MAX_LONG.
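
A minimal sketch of the difference, with hypothetical names (MaxLagConfig,
isPastMaxLag). With 0 as the disabled value the check returns before any
timestamp comparison; with Long.MAX_VALUE the comparison still runs and just
never succeeds.

```java
final class MaxLagConfig {
    // Assumption for this sketch: 0 means the feature is disabled.
    static final long DISABLED = 0L;

    static boolean isEnabled(long maxCompactionLagMs) {
        return maxCompactionLagMs != DISABLED;
    }

    static boolean isPastMaxLag(long earliestTimestampMs,
                                long maxCompactionLagMs,
                                long nowMs) {
        if (!isEnabled(maxCompactionLagMs)) {
            return false; // disabled: skip the max-lag path entirely
        }
        // With Long.MAX_VALUE this is still evaluated, but the right-hand
        // side is hugely negative, so no real timestamp is ever below it.
        return earliestTimestampMs < nowMs - maxCompactionLagMs;
    }
}
```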

> 6) It is probably cleaner and more readable not to include in the Public
> Interface section those configs whose meaning is not changed.
>
====> I will clean that up.

> 7) The goal of this KIP is to ensure that a log segment whose earliest
> message is earlier than a given threshold will be compacted. This goal may
> not be achieved if the compaction throughput cannot catch up with the total
> bytes-in rate for the compacted topics on the broker. Thus we need an easy
> way to tell the operator whether this goal is achieved. If we don't already
> have such a metric, maybe we can include metrics to show 1) the total number
> of log segments (or logs) which need to be immediately compacted as
> determined by max.compaction.lag.ms; and 2) the maximum value of now -
> earliest_time_stamp_of_segment among all segments that need to be
> compacted.
>
=======> Good suggestion.  I will update the KIP for these metrics.
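
A rough sketch of how the two values could be computed from the estimated
earliest timestamps of the un-compacted segments. The CompactionLagMetrics
class and its field names are hypothetical; the final metric names will be
defined in the KIP.

```java
final class CompactionLagMetrics {
    final int segmentsPastMaxLag; // metric 1: segments that must be compacted now
    final long maxLagMs;          // metric 2: largest (now - earliest timestamp) among them

    CompactionLagMetrics(int segmentsPastMaxLag, long maxLagMs) {
        this.segmentsPastMaxLag = segmentsPastMaxLag;
        this.maxLagMs = maxLagMs;
    }

    static CompactionLagMetrics compute(long[] earliestTimestampsMs,
                                        long maxCompactionLagMs,
                                        long nowMs) {
        int count = 0;
        long maxLag = 0L;
        for (long earliest : earliestTimestampsMs) {
            long age = nowMs - earliest;
            if (age > maxCompactionLagMs) {
                count++;
                maxLag = Math.max(maxLag, age);
            }
        }
        return new CompactionLagMetrics(count, maxLag);
    }
}
```

A persistently non-zero count, or a max lag that keeps growing, would tell
the operator that compaction is not keeping up with the configured bound.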

> 8) The Performance Impact section suggests that users use the existing
> metrics to monitor the performance impact of this KIP. It is useful to list
> the meaning of each JMX metric that we want users to monitor, and possibly
> explain how to interpret the values of these metrics to determine whether
> there is a performance issue.
>
=========>  I will update the KIP.

> Thanks,
> Dong
>
> On Tue, Oct 16, 2018 at 10:53 AM xiongqi wu <xiongq...@gmail.com> wrote:
>
> > Mayuresh,
> >
> > Thanks for the comments.
> > The requirement is that we need to pick up segments that are older than
> > maxCompactionLagMs for compaction.
> > maxCompactionLagMs is an upper bound, which implies that picking up
> > segments for compaction earlier doesn't violate the policy.
> > We use the creation time of a segment as an estimate of its records'
> > arrival time, so these records can be compacted no later than
> > maxCompactionLagMs.
> >
> > On the other hand, compaction is an expensive operation, so we don't want
> > to compact the log partition whenever a new segment is sealed.
> > Therefore, we want to pick up a segment for compaction when the segment
> > is close to the mandatory max compaction lag (so we use the segment
> > creation time as an estimate).
> >
> >
> > Xiongqi (Wesley) Wu
> >
> >
> > On Mon, Oct 15, 2018 at 5:54 PM Mayuresh Gharat <
> > gharatmayures...@gmail.com>
> > wrote:
> >
> > > Hi Wesley,
> > >
> > > Thanks for the KIP and sorry for being late to the party.
> > > I wanted to understand the scenario you mentioned in Proposed changes:
> > >
> > > -
> > > >
> > > > Estimate the earliest message timestamp of an un-compacted log
> > > > segment. We only need to estimate the earliest message timestamp for
> > > > un-compacted log segments to ensure timely compaction because the
> > > > deletion requests that belong to compacted segments have already been
> > > > processed.
> > > >
> > > >    1.
> > > >
> > > >    for the first (earliest) log segment:  The estimated earliest
> > > >    timestamp is set to the timestamp of the first message if a
> > > >    timestamp is present in the message. Otherwise, the estimated
> > > >    earliest timestamp is set to "segment.largestTimestamp -
> > > >    maxSegmentMs" (segment.largestTimestamp is the lastModified time
> > > >    of the log segment or the max timestamp we see for the log
> > > >    segment). In the latter case, the actual timestamp of the first
> > > >    message might be later than the estimation, but it is safe to pick
> > > >    up the log for compaction earlier.
> > > >
> > > When we say "actual timestamp of the first message might be later than
> > > the estimation, but it is safe to pick up the log for compaction
> > > earlier.", doesn't that violate the assumption that we will consider a
> > > segment for compaction only if the creation time of the segment has
> > > crossed "now - maxCompactionLagMs"?
> > >
> > > Thanks,
> > >
> > > Mayuresh
> > >
> > > On Mon, Sep 3, 2018 at 7:28 PM Brett Rann <br...@zendesk.com.invalid>
> > > wrote:
> > >
> > > > Might also be worth moving to a vote thread? Discussion seems to
> > > > have gone as far as it can.
> > > >
> > > > > On 4 Sep 2018, at 12:08, xiongqi wu <xiongq...@gmail.com> wrote:
> > > > >
> > > > > Brett,
> > > > >
> > > > > Yes, I will post PR tomorrow.
> > > > >
> > > > > Xiongqi (Wesley) Wu
> > > > >
> > > > >
> > > > > On Sun, Sep 2, 2018 at 6:28 PM Brett Rann
> <br...@zendesk.com.invalid
> > >
> > > > wrote:
> > > > >
> > > > > > > +1 (non-binding) from me on the interface. I'd like to see someone
> > > > > > > familiar with the code comment on the approach, and note there's a
> > > > > > > couple of different approaches: what's documented in the KIP, and
> > > > > > > what Xiaohe Dong was working on here:
> > > > > > >
> > > > > > > https://github.com/dongxiaohe/kafka/tree/dongxiaohe/log-cleaner-compaction-max-lifetime-2.0
> > > > > > >
> > > > > > > If you have code working already, Xiongqi Wu, could you share a PR?
> > > > > > > I'd be happy to start testing.
> > > > > >
> > > > > > On Tue, Aug 28, 2018 at 5:57 AM xiongqi wu <xiongq...@gmail.com>
> > > > wrote:
> > > > > >
> > > > > > > Hi All,
> > > > > > >
> > > > > > > Do you have any additional comments on this KIP?
> > > > > > >
> > > > > > >
> > > > > > > On Thu, Aug 16, 2018 at 9:17 PM, xiongqi wu <
> xiongq...@gmail.com
> > >
> > > > wrote:
> > > > > > >
> > > > > > > > on 2)
> > > > > > > > > The offset map is built starting from the dirty segment.
> > > > > > > > > The compaction starts from the beginning of the log
> > > > > > > > > partition. That's how it ensures the deletion of
> > > > > > > > > tombstoned keys.
> > > > > > > > I will double check tomorrow.
> > > > > > > >
> > > > > > > > Xiongqi (Wesley) Wu
> > > > > > > >
> > > > > > > >
> > > > > > > > On Thu, Aug 16, 2018 at 6:46 PM Brett Rann
> > > > <br...@zendesk.com.invalid>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > >> To just clarify a bit on 1: whether there's an external storage/DB
> > > > > > > > >> isn't relevant here. Compacted topics allow a tombstone record to
> > > > > > > > >> be sent (a null value for a key), which currently will result in
> > > > > > > > >> old values for that key being deleted if some conditions are met.
> > > > > > > > >> There are existing controls to make sure the old values will stay
> > > > > > > > >> around for a minimum time at least, but no dedicated control to
> > > > > > > > >> ensure the tombstone will delete within a maximum time.
> > > > > > > > >>
> > > > > > > > >> One popular reason that a maximum time for deletion is desirable
> > > > > > > > >> right now is GDPR with PII. But we're not proposing any GDPR
> > > > > > > > >> awareness in Kafka, just being able to guarantee a max time within
> > > > > > > > >> which a tombstoned key will be removed from the compacted topic.
> > > > > > > > >>
> > > > > > > > >> on 2)
> > > > > > > > >> huh, I thought it kept track of the first dirty segment and didn't
> > > > > > > > >> recompact older "clean" ones.
> > > > > > > > >> But I didn't look at code or test for that.
> > > > > > > >>
> > > > > > > >> On Fri, Aug 17, 2018 at 10:57 AM xiongqi wu <
> > > xiongq...@gmail.com>
> > > > > > > wrote:
> > > > > > > >>
> > > > > > > > >> > 1) The owner of the data (in this sense, Kafka is not the owner of
> > > > > > > > >> > the data) should keep track of the lifecycle of the data in some
> > > > > > > > >> > external storage/DB. The owner determines when to delete the data
> > > > > > > > >> > and sends the delete request to Kafka. Kafka doesn't know about
> > > > > > > > >> > the content of the data; it only provides a means for deletion.
> > > > > > > > >> >
> > > > > > > > >> > 2) Each time compaction runs, it will start from the first
> > > > > > > > >> > segments (no matter whether they are compacted or not). The time
> > > > > > > > >> > estimation here is only used to determine whether we should run
> > > > > > > > >> > compaction on this log partition. So we only need to estimate for
> > > > > > > > >> > un-compacted segments.
> > > > > > > >> >
> > > > > > > >> > On Thu, Aug 16, 2018 at 5:35 PM, Dong Lin <
> > > lindon...@gmail.com>
> > > > > > > wrote:
> > > > > > > >> >
> > > > > > > >> > > Hey Xiongqi,
> > > > > > > >> > >
> > > > > > > > >> > > Thanks for the update. I have two questions for the latest KIP.
> > > > > > > > >> > >
> > > > > > > > >> > > 1) The motivation section says that one use case is to delete PII
> > > > > > > > >> > > (Personally Identifiable Information) data within 7 days while
> > > > > > > > >> > > keeping non-PII indefinitely in compacted format. I suppose the
> > > > > > > > >> > > use case depends on the application to determine when to delete
> > > > > > > > >> > > those PII data. Could you explain how the application can
> > > > > > > > >> > > reliably determine the set of keys that should be deleted? Is
> > > > > > > > >> > > the application required to always read messages from the topic
> > > > > > > > >> > > after every restart and determine the keys to be deleted by
> > > > > > > > >> > > looking at the message timestamp, or is the application supposed
> > > > > > > > >> > > to persist the key -> timestamp information in a separate
> > > > > > > > >> > > persistent storage system?
> > > > > > > > >> > >
> > > > > > > > >> > > 2) It is mentioned in the KIP that "we only need to estimate
> > > > > > > > >> > > earliest message timestamp for un-compacted log segments because
> > > > > > > > >> > > the deletion requests that belong to compacted segments have
> > > > > > > > >> > > already been processed". Not sure if it is correct. If a segment
> > > > > > > > >> > > is compacted before the user sends a message to delete a key in
> > > > > > > > >> > > this segment, it seems that we still need to ensure that the
> > > > > > > > >> > > segment will be compacted again within the given time after the
> > > > > > > > >> > > deletion is requested, right?
> > > > > > > >> > >
> > > > > > > >> > > Thanks,
> > > > > > > >> > > Dong
> > > > > > > >> > >
> > > > > > > >> > > On Thu, Aug 16, 2018 at 10:27 AM, xiongqi wu <
> > > > xiongq...@gmail.com
> > > > > > >
> > > > > > > >> > wrote:
> > > > > > > >> > >
> > > > > > > >> > > > Hi Xiaohe,
> > > > > > > >> > > >
> > > > > > > >> > > > Quick note:
> > > > > > > > >> > > > 1) Use the minimum of segment.ms and max.compaction.lag.ms.
> > > > > > > > >> > > >
> > > > > > > > >> > > > 2) I am not sure if I get your second question. First, we have
> > > > > > > > >> > > > jitter when we roll the active segment. Second, on each
> > > > > > > > >> > > > compaction, we compact up to what the offset map allows. Those
> > > > > > > > >> > > > will not lead to a perfect compaction storm over time. In
> > > > > > > > >> > > > addition, I expect we are setting max.compaction.lag.ms on the
> > > > > > > > >> > > > order of days.
> > > > > > > > >> > > >
> > > > > > > > >> > > > 3) I don't have access to the Confluent community Slack for now.
> > > > > > > > >> > > > I am reachable via the google handle out.
> > > > > > > > >> > > > To avoid double effort, here is my plan:
> > > > > > > > >> > > > a) Collect more feedback and feature requirements on the KIP.
> > > > > > > > >> > > > b) Wait until this KIP is approved.
> > > > > > > > >> > > > c) I will address any additional requirements in the
> > > > > > > > >> > > > implementation. (My current implementation only complies with
> > > > > > > > >> > > > whatever is described in the KIP now.)
> > > > > > > > >> > > > d) I can share the code with you and the community to see if
> > > > > > > > >> > > > you want to add anything.
> > > > > > > > >> > > > e) Submission through committee.
> > > > > > > >> > > >
> > > > > > > >> > > >
> > > > > > > >> > > > On Wed, Aug 15, 2018 at 11:42 PM, XIAOHE DONG <
> > > > > > > >> dannyriv...@gmail.com>
> > > > > > > >> > > > wrote:
> > > > > > > >> > > >
> > > > > > > >> > > > > Hi Xiongqi
> > > > > > > >> > > > >
> > > > > > > > >> > > > > Thanks for thinking about implementing this as well. :)
> > > > > > > > >> > > > >
> > > > > > > > >> > > > > I was thinking about using `segment.ms` to trigger the segment
> > > > > > > > >> > > > > roll. Also, its value can be the largest time bias for the
> > > > > > > > >> > > > > record deletion. For example, if `segment.ms` is 1 day and
> > > > > > > > >> > > > > `max.compaction.ms` is 30 days, the compaction may happen
> > > > > > > > >> > > > > around 31 days.
> > > > > > > > >> > > > >
> > > > > > > > >> > > > > Out of curiosity, is there a way we can do some performance
> > > > > > > > >> > > > > tests for this, and are there any tools you can recommend? As
> > > > > > > > >> > > > > you know, previously, it was cleaned up by respecting the
> > > > > > > > >> > > > > dirty ratio, but now it may happen anytime if the max lag has
> > > > > > > > >> > > > > passed for each message. I wonder what would happen if clients
> > > > > > > > >> > > > > send a huge amount of tombstone records at the same time.
> > > > > > > > >> > > > >
> > > > > > > > >> > > > > I am looking forward to having a quick chat with you to avoid
> > > > > > > > >> > > > > double effort on this. I am in the Confluent community Slack
> > > > > > > > >> > > > > during work time. My name is Xiaohe Dong. :)
> > > > > > > >> > > > >
> > > > > > > >> > > > > Rgds
> > > > > > > >> > > > > Xiaohe Dong
> > > > > > > >> > > > >
> > > > > > > >> > > > >
> > > > > > > >> > > > >
> > > > > > > >> > > > > On 2018/08/16 01:22:22, xiongqi wu <
> > xiongq...@gmail.com
> > > >
> > > > > > wrote:
> > > > > > > >> > > > > > Brett,
> > > > > > > >> > > > > >
> > > > > > > >> > > > > > Thank you for your comments.
> > > > > > > > >> > > > > > I was thinking that since we already have an
> > > > > > > > >> > > > > > immediate-compaction setting (setting min dirty ratio to 0),
> > > > > > > > >> > > > > > I decided to use "0" as the disabled state.
> > > > > > > > >> > > > > > I am OK to go with the -1 (disabled), 0 (immediate) options.
> > > > > > > > >> > > > > >
> > > > > > > > >> > > > > > For the implementation, there are a few differences between
> > > > > > > > >> > > > > > mine and "Xiaohe Dong"'s:
> > > > > > > > >> > > > > > 1) I used the estimated creation time of a log segment
> > > > > > > > >> > > > > > instead of the largest timestamp of a log to determine the
> > > > > > > > >> > > > > > compaction eligibility, because a log segment might stay as
> > > > > > > > >> > > > > > an active segment up to "max compaction lag" (see the KIP
> > > > > > > > >> > > > > > for details).
> > > > > > > > >> > > > > > 2) I measure how many bytes we must clean to follow the "max
> > > > > > > > >> > > > > > compaction lag" rule, and use that to determine the order of
> > > > > > > > >> > > > > > compaction.
> > > > > > > > >> > > > > > 3) Force the active segment to roll to follow the "max
> > > > > > > > >> > > > > > compaction lag".
> > > > > > > > >> > > > > >
> > > > > > > > >> > > > > > I can share my code so we can coordinate.
> > > > > > > > >> > > > > >
> > > > > > > > >> > > > > > I haven't thought about a new API to force a compaction.
> > > > > > > > >> > > > > > What is the use case for this one?
> > > > > > > >> > > > > >
> > > > > > > >> > > > > >
> > > > > > > >> > > > > > On Wed, Aug 15, 2018 at 5:33 PM, Brett Rann
> > > > > > > >> > > <br...@zendesk.com.invalid
> > > > > > > >> > > > >
> > > > > > > >> > > > > > wrote:
> > > > > > > >> > > > > >
> > > > > > > >> > > > > > > We've been looking into this too.
> > > > > > > >> > > > > > >
> > > > > > > > >> > > > > > > Mailing list:
> > > > > > > > >> > > > > > > https://lists.apache.org/thread.html/ed7f6a6589f94e8c2a705553f364ef599cb6915e4c3ba9b561e610e4@%3Cdev.kafka.apache.org%3E
> > > > > > > > >> > > > > > > jira wish: https://issues.apache.org/jira/browse/KAFKA-7137
> > > > > > > > >> > > > > > > confluent slack discussion:
> > > > > > > > >> > > > > > > https://confluentcommunity.slack.com/archives/C49R61XMM/p1530760121000039
> > > > > > > >> > > > > > >
> > > > > > > >> > > > > > > A person on my team has started on code so you
> > might
> > > > want
> > > > > > to
> > > > > > > >> > > > > coordinate:
> > > > > > > >> > > > > > >
> > > > > > > > >> > > > > > > https://github.com/dongxiaohe/kafka/tree/dongxiaohe/log-cleaner-compaction-max-lifetime-2.0
> > > > > > > >> > > > > > >
> > > > > > > >> > > > > > > He's been working with Jason Gustafson and James
> > > Chen
> > > > > > around
> > > > > > > >> the
> > > > > > > >> > > > > changes.
> > > > > > > >> > > > > > > You can ping him on confluent slack as Xiaohe
> > Dong.
> > > > > > > >> > > > > > >
> > > > > > > >> > > > > > > It's great to know others are thinking on it as
> > > well.
> > > > > > > >> > > > > > >
> > > > > > > >> > > > > > > You've added the requirement to force a segment
> > roll
> > > > which
> > > > > > > we
> > > > > > > >> > > hadn't
> > > > > > > >> > > > > gotten
> > > > > > > >> > > > > > > to yet, which is great. I was content with it
> not
> > > > > > including
> > > > > > > >> the
> > > > > > > >> > > > active
> > > > > > > >> > > > > > > segment.
> > > > > > > >> > > > > > >
> > > > > > > > >> > > > > > > > Adding topic level configuration "max.compaction.lag.ms",
> > > > > > > > >> > > > > > > > and corresponding broker configuration
> > > > > > > > >> > > > > > > > "log.cleaner.max.compaction.lag.ms", which is set to 0
> > > > > > > > >> > > > > > > > (disabled) by default.
> > > > > > > >> > > > > > >
> > > > > > > > >> > > > > > > Glancing at some other settings, the convention seems to me
> > > > > > > > >> > > > > > > to be -1 for disabled (or infinite, which is more
> > > > > > > > >> > > > > > > meaningful here). 0 to me implies instant, a little
> > > > > > > > >> > > > > > > quicker than 1.
> > > > > > > > >> > > > > > >
> > > > > > > > >> > > > > > > We've been trying to think about a way to trigger
> > > > > > > > >> > > > > > > compaction as well through an API call, which would need
> > > > > > > > >> > > > > > > to be flagged somewhere (ZK admin/ space?) but we're
> > > > > > > > >> > > > > > > struggling to think how that would be coordinated across
> > > > > > > > >> > > > > > > brokers and partitions. Have you given any thought to that?
> > > > > > > >> > > > > > >
> > > > > > > >> > > > > > >
> > > > > > > >> > > > > > >
> > > > > > > >> > > > > > >
> > > > > > > >> > > > > > >
> > > > > > > >> > > > > > > On Thu, Aug 16, 2018 at 8:44 AM xiongqi wu <
> > > > > > > >> xiongq...@gmail.com>
> > > > > > > >> > > > > wrote:
> > > > > > > >> > > > > > >
> > > > > > > >> > > > > > > > Eno, Dong,
> > > > > > > >> > > > > > > >
> > > > > > > > >> > > > > > > > I have updated the KIP. We decided not to address the
> > > > > > > > >> > > > > > > > issue that we might have for topics with both compaction
> > > > > > > > >> > > > > > > > and time retention enabled (see the rejected alternative
> > > > > > > > >> > > > > > > > item 2). This KIP will only ensure that a log can be
> > > > > > > > >> > > > > > > > compacted after a specified time interval.
> > > > > > > > >> > > > > > > >
> > > > > > > > >> > > > > > > > As suggested by Dong, we will also enforce that
> > > > > > > > >> > > > > > > > "max.compaction.lag.ms" is not less than
> > > > > > > > >> > > > > > > > "min.compaction.lag.ms".
> > > > > > > >> > > > > > > >
> > > > > > > >> > > > > > > >
> > > > > > > > >> > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-354
> > > > > > > > >> > > > > > > > Time-based log compaction policy
> > > > > > > >> > > > > > > >
> > > > > > > >> > > > > > > >
> > > > > > > >> > > > > > > > On Tue, Aug 14, 2018 at 5:01 PM, xiongqi wu <
> > > > > > > >> > xiongq...@gmail.com
> > > > > > > >> > > >
> > > > > > > >> > > > > wrote:
> > > > > > > >> > > > > > > >
> > > > > > > >> > > > > > > > >
> > > > > > > > >> > > > > > > > > Per discussion with Dong, he made a very good point that
> > > > > > > > >> > > > > > > > > if compaction and time-based retention are both enabled
> > > > > > > > >> > > > > > > > > on a topic, the compaction might prevent records from
> > > > > > > > >> > > > > > > > > being deleted on time. The reason is that when compacting
> > > > > > > > >> > > > > > > > > multiple segments into one single segment, the newly
> > > > > > > > >> > > > > > > > > created segment will have the same lastModified timestamp
> > > > > > > > >> > > > > > > > > as the latest original segment. We lose the timestamps of
> > > > > > > > >> > > > > > > > > all original segments except the last one. As a result,
> > > > > > > > >> > > > > > > > > records might not be deleted as they should be through
> > > > > > > > >> > > > > > > > > time-based retention.
> > > > > > > > >> > > > > > > > >
> > > > > > > > >> > > > > > > > > With the current KIP proposal, if we want to ensure
> > > > > > > > >> > > > > > > > > timely deletion, we have the following configurations:
> > > > > > > > >> > > > > > > > > 1) enable time-based log compaction only: deletion is
> > > > > > > > >> > > > > > > > > done through overriding the same key
> > > > > > > > >> > > > > > > > > 2) enable time-based log retention only: deletion is
> > > > > > > > >> > > > > > > > > done through time-based retention
> > > > > > > > >> > > > > > > > > 3) enable both log compaction and time-based retention:
> > > > > > > > >> > > > > > > > > deletion is not guaranteed.
> > > > > > > > >> > > > > > > > >
> > > > > > > > >> > > > > > > > > Not sure if we have use case 3 and also want deletion to
> > > > > > > > >> > > > > > > > > happen on time. There are several options to address the
> > > > > > > > >> > > > > > > > > deletion issue when enabling both compaction and
> > > > > > > > >> > > > > > > > > retention:
> > > > > > > > >> > > > > > > > > A) During log compaction, look into the record timestamp
> > > > > > > > >> > > > > > > > > to delete expired records. This can be done in the
> > > > > > > > >> > > > > > > > > compaction logic itself or by using
> > > > > > > > >> > > > > > > > > AdminClient.deleteRecords(). But this assumes we have a
> > > > > > > > >> > > > > > > > > record timestamp.
> > > > > > > > >> > > > > > > > > B) Retain the lastModified time of the original segments
> > > > > > > > >> > > > > > > > > during log compaction. This requires extra metadata to
> > > > > > > > >> > > > > > > > > record the information, or not grouping multiple
> > > > > > > > >> > > > > > > > > segments into one during compaction.
> > > > > > > > >> > > > > > > > >
> > > > > > > > >> > > > > > > > > If we have use case 3 in general, I would prefer
> > > > > > > > >> > > > > > > > > solution A and rely on the record timestamp.
> > > > > > > > >> > > > > > > > >
> > > > > > > > >> > > > > > > > >
> > > > > > > > >> > > > > > > > > Two questions:
> > > > > > > > >> > > > > > > > > Do we have use case 3? Is it nice to have or must have?
> > > > > > > > >> > > > > > > > > If we have use case 3 and want to go with solution A,
> > > > > > > > >> > > > > > > > > should we introduce a new configuration to enforce
> > > > > > > > >> > > > > > > > > deletion by timestamp?
> > > > > > > >> > > > > > > > >
> > > > > > > >> > > > > > > > >
> > > > > > > >> > > > > > > > > On Tue, Aug 14, 2018 at 1:52 PM, xiongqi wu
> <
> > > > > > > >> > > xiongq...@gmail.com
> > > > > > > >> > > > >
> > > > > > > >> > > > > > > wrote:
> > > > > > > >> > > > > > > > >
> > > > > > > >> > > > > > > > >> Dong,
> > > > > > > >> > > > > > > > >>
> > > > > > > >> > > > > > > > >> Thanks for the comment.
> > > > > > > >> > > > > > > > >>
> > > > > > > >> > > > > > > > >> There are two retention policy: log
> > compaction
> > > > and
> > > > > > time
> > > > > > > >> > based
> > > > > > > >> > > > > > > retention.
> > > > > > > >> > > > > > > > >>
> > > > > > > >> > > > > > > > >> Log compaction:
> > > > > > > >> > > > > > > > >>
> > > > > > > > >> > > > > > > > >> We have use cases that keep a topic with infinite
> > > > > > > > >> > > > > > > > >> retention (compaction only). GDPR cares about
> > > > > > > > >> > > > > > > > >> deletion of PII (personally identifiable
> > > > > > > > >> > > > > > > > >> information) data.
> > > > > > > > >> > > > > > > > >> Since Kafka doesn't know which records contain PII,
> > > > > > > > >> > > > > > > > >> it relies on the upper layer to delete those
> > > > > > > > >> > > > > > > > >> records.
> > > > > > > > >> > > > > > > > >> For those infinite-retention use cases, Kafka needs
> > > > > > > > >> > > > > > > > >> to provide a way to enforce compaction on time.
> > > > > > > > >> > > > > > > > >> This is what we try to address in this KIP.
> > > > > > > >> > > > > > > > >>
> > > > > > > > >> > > > > > > > >> Time-based retention:
> > > > > > > > >> > > > > > > > >>
> > > > > > > > >> > > > > > > > >> There are also use cases where users of Kafka might
> > > > > > > > >> > > > > > > > >> want to expire all their data.
> > > > > > > > >> > > > > > > > >> In those cases, they can use time-based retention
> > > > > > > > >> > > > > > > > >> on their topics.
> > > > > > > >> > > > > > > > >>
> > > > > > > >> > > > > > > > >>
> > > > > > > > >> > > > > > > > >> Regarding your first question: if a user wants to
> > > > > > > > >> > > > > > > > >> delete a key in a log-compacted topic, the user
> > > > > > > > >> > > > > > > > >> has to send a deletion using the same key.
> > > > > > > > >> > > > > > > > >> Kafka only makes sure the deletion will happen
> > > > > > > > >> > > > > > > > >> within a certain time period (like 2 days/7 days).
> > > > > > > >> > > > > > > > >>
> > > > > > > > >> > > > > > > > >> Regarding your second question: in most cases, we
> > > > > > > > >> > > > > > > > >> might want to delete all duplicated keys at the
> > > > > > > > >> > > > > > > > >> same time.
> > > > > > > > >> > > > > > > > >> Compaction might be more efficient since we need to
> > > > > > > > >> > > > > > > > >> scan the log and find all duplicates. However, the
> > > > > > > > >> > > > > > > > >> expected use case is to set the time-based
> > > > > > > > >> > > > > > > > >> compaction interval on the order of days, and
> > > > > > > > >> > > > > > > > >> larger than the "min compaction lag". We don't want
> > > > > > > > >> > > > > > > > >> log compaction to happen frequently since it is
> > > > > > > > >> > > > > > > > >> expensive. The purpose is to help topics with a low
> > > > > > > > >> > > > > > > > >> production rate get compacted on time. For a topic
> > > > > > > > >> > > > > > > > >> with a "normal" incoming message rate, the "min
> > > > > > > > >> > > > > > > > >> dirty ratio" might have triggered the compaction
> > > > > > > > >> > > > > > > > >> before this time-based compaction policy takes
> > > > > > > > >> > > > > > > > >> effect.
> > > > > > > >> > > > > > > > >>
> > > > > > > >> > > > > > > > >>
> > > > > > > >> > > > > > > > >> Eno,
> > > > > > > >> > > > > > > > >>
> > > > > > > > >> > > > > > > > >> For your question: as I mentioned, we have a
> > > > > > > > >> > > > > > > > >> long-retention use case for log-compacted topics,
> > > > > > > > >> > > > > > > > >> but we want to provide the ability to delete
> > > > > > > > >> > > > > > > > >> certain PII records on time.
> > > > > > > > >> > > > > > > > >> Kafka itself doesn't know whether a record contains
> > > > > > > > >> > > > > > > > >> sensitive information and relies on the user for
> > > > > > > > >> > > > > > > > >> deletion.
> > > > > > > >> > > > > > > > >>
> > > > > > > >> > > > > > > > >>
> > > > > > > > >> > > > > > > > >> On Mon, Aug 13, 2018 at 6:58 PM, Dong Lin
> > > > > > > > >> > > > > > > > >> <lindon...@gmail.com> wrote:
> > > > > > > >> > > > > > > > >>
> > > > > > > >> > > > > > > > >>> Hey Xiongqi,
> > > > > > > >> > > > > > > > >>>
> > > > > > > > >> > > > > > > > >>> Thanks for the KIP. I have two questions regarding
> > > > > > > > >> > > > > > > > >>> the use-case for meeting the GDPR requirement.
> > > > > > > >> > > > > > > > >>>
> > > > > > > > >> > > > > > > > >>> 1) If I recall correctly, one of the GDPR
> > > > > > > > >> > > > > > > > >>> requirements is that we cannot keep messages
> > > > > > > > >> > > > > > > > >>> longer than e.g. 30 days in storage (e.g. Kafka).
> > > > > > > > >> > > > > > > > >>> Say there exists a partition p0 which contains
> > > > > > > > >> > > > > > > > >>> message1 with key1 and message2 with key2. And
> > > > > > > > >> > > > > > > > >>> then the user keeps producing messages with
> > > > > > > > >> > > > > > > > >>> key=key2 to this partition. Since message1 with
> > > > > > > > >> > > > > > > > >>> key1 is never overridden, sooner or later we will
> > > > > > > > >> > > > > > > > >>> want to delete message1 and keep the latest
> > > > > > > > >> > > > > > > > >>> message with key=key2. But currently it looks like
> > > > > > > > >> > > > > > > > >>> the log compaction logic in Kafka will always put
> > > > > > > > >> > > > > > > > >>> these messages in the same segment. Will this be
> > > > > > > > >> > > > > > > > >>> an issue?
> > > > > > > >> > > > > > > > >>>
> > > > > > > > >> > > > > > > > >>> 2) The current KIP intends to provide the
> > > > > > > > >> > > > > > > > >>> capability to delete a given message in a
> > > > > > > > >> > > > > > > > >>> log-compacted topic. Does such a use-case also
> > > > > > > > >> > > > > > > > >>> require Kafka to keep the messages produced before
> > > > > > > > >> > > > > > > > >>> the given message? If yes, then we can probably
> > > > > > > > >> > > > > > > > >>> just use AdminClient.deleteRecords() or time-based
> > > > > > > > >> > > > > > > > >>> log retention to meet the use-case requirement. If
> > > > > > > > >> > > > > > > > >>> no, do you know what GDPR's requirement is on
> > > > > > > > >> > > > > > > > >>> time-to-deletion after the user explicitly
> > > > > > > > >> > > > > > > > >>> requests the deletion (e.g. 1 hour, 1 day, 7
> > > > > > > > >> > > > > > > > >>> days)?
> > > > > > > >> > > > > > > > >>>
> > > > > > > >> > > > > > > > >>> Thanks,
> > > > > > > >> > > > > > > > >>> Dong
> > > > > > > >> > > > > > > > >>>
> > > > > > > >> > > > > > > > >>>
> > > > > > > > >> > > > > > > > >>> On Mon, Aug 13, 2018 at 3:44 PM, xiongqi wu
> > > > > > > > >> > > > > > > > >>> <xiongq...@gmail.com> wrote:
> > > > > > > >> > > > > > > > >>>
> > > > > > > >> > > > > > > > >>> > Hi Eno,
> > > > > > > >> > > > > > > > >>> >
> > > > > > > > >> > > > > > > > >>> > The GDPR request we are getting here at LinkedIn
> > > > > > > > >> > > > > > > > >>> > is: if we get a request to delete a record
> > > > > > > > >> > > > > > > > >>> > through a null value (tombstone) on a
> > > > > > > > >> > > > > > > > >>> > log-compacted topic, we want to delete the
> > > > > > > > >> > > > > > > > >>> > record via compaction within a given time period
> > > > > > > > >> > > > > > > > >>> > like 2 days (whatever is required by the
> > > > > > > > >> > > > > > > > >>> > policy).
> > > > > > > >> > > > > > > > >>> >
> > > > > > > > >> > > > > > > > >>> > There might be other issues (such as orphan log
> > > > > > > > >> > > > > > > > >>> > segments under certain conditions) that lead to
> > > > > > > > >> > > > > > > > >>> > GDPR problems, but they are more like something
> > > > > > > > >> > > > > > > > >>> > we need to fix anyway regardless of GDPR.
> > > > > > > >> > > > > > > > >>> >
> > > > > > > >> > > > > > > > >>> >
> > > > > > > >> > > > > > > > >>> > -- Xiongqi (Wesley) Wu
> > > > > > > >> > > > > > > > >>> >
> > > > > > > > >> > > > > > > > >>> > On Mon, Aug 13, 2018 at 2:56 PM, Eno Thereska
> > > > > > > > >> > > > > > > > >>> > <eno.there...@gmail.com> wrote:
> > > > > > > >> > > > > > > > >>> >
> > > > > > > >> > > > > > > > >>> > > Hello,
> > > > > > > >> > > > > > > > >>> > >
> > > > > > > > >> > > > > > > > >>> > > Thanks for the KIP. I'd like to see a more
> > > > > > > > >> > > > > > > > >>> > > precise definition of what part of GDPR you
> > > > > > > > >> > > > > > > > >>> > > are targeting as well as some sort of
> > > > > > > > >> > > > > > > > >>> > > verification that this KIP actually addresses
> > > > > > > > >> > > > > > > > >>> > > the problem. Right now I find this a bit
> > > > > > > > >> > > > > > > > >>> > > vague:
> > > > > > > >> > > > > > > > >>> > >
> > > > > > > >> > > > > > > > >>> > > "Ability to delete a log message
> through
> > > > > > > compaction
> > > > > > > >> in
> > > > > > > >> > a
> > > > > > > >> > > > > timely
> > > > > > > >> > > > > > > > >>> manner
> > > > > > > >> > > > > > > > >>> > has
> > > > > > > >> > > > > > > > >>> > > become an important requirement in
> some
> > > use
> > > > > > cases
> > > > > > > >> > (e.g.,
> > > > > > > >> > > > > GDPR)"
> > > > > > > >> > > > > > > > >>> > >
> > > > > > > >> > > > > > > > >>> > >
> > > > > > > > >> > > > > > > > >>> > > Is there any guarantee that after this KIP the
> > > > > > > > >> > > > > > > > >>> > > GDPR problem is solved or do we need to do
> > > > > > > > >> > > > > > > > >>> > > something else as well, e.g., more KIPs?
> > > > > > > >> > > > > > > > >>> > >
> > > > > > > >> > > > > > > > >>> > >
> > > > > > > >> > > > > > > > >>> > > Thanks
> > > > > > > >> > > > > > > > >>> > >
> > > > > > > >> > > > > > > > >>> > > Eno
> > > > > > > >> > > > > > > > >>> > >
> > > > > > > >> > > > > > > > >>> > >
> > > > > > > >> > > > > > > > >>> > >
> > > > > > > > >> > > > > > > > >>> > > On Thu, Aug 9, 2018 at 4:18 PM, xiongqi wu
> > > > > > > > >> > > > > > > > >>> > > <xiongq...@gmail.com> wrote:
> > > > > > > >> > > > > > > > >>> > >
> > > > > > > >> > > > > > > > >>> > > > Hi Kafka,
> > > > > > > >> > > > > > > > >>> > > >
> > > > > > > > >> > > > > > > > >>> > > > This KIP tries to address the GDPR concern
> > > > > > > > >> > > > > > > > >>> > > > of fulfilling deletion requests on time
> > > > > > > > >> > > > > > > > >>> > > > through time-based log compaction on a
> > > > > > > > >> > > > > > > > >>> > > > compaction-enabled topic:
> > > > > > > >> > > > > > > > >>> > > >
> > > > > > > >> > > > > > > > >>> > > >
> > > > > > > > >> > > > > > > > >>> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-354%3A+Time-based+log+compaction+policy
> > > > > > > >> > > > > > > > >>> > > >
> > > > > > > >> > > > > > > > >>> > > > Any feedback will be appreciated.
> > > > > > > >> > > > > > > > >>> > > >
> > > > > > > >> > > > > > > > >>> > > >
> > > > > > > >> > > > > > > > >>> > > > Xiongqi (Wesley) Wu
> > > > > > > >> > > > > > > > >>> > > >
> > > > > > > >> > > > > > > > >>> > >
> > > > > > > >> > > > > > > > >>> >
> > > > > > > >> > > > > > > > >>>
> > > > > > > >> > > > > > > > >>
> > > > > > > >> > > > > > > > >>
> > > > > > > >> > > > > > > > >>
> > > > > > > >> > > > > > > > >> --
> > > > > > > >> > > > > > > > >> Xiongqi (Wesley) Wu
> > > > > > > >> > > > > > > > >>
> > > > > > > >> > > > > > > > >
> > > > > > > >> > > > > > > > >
> > > > > > > >> > > > > > > > >
> > > > > > > >> > > > > > > > > --
> > > > > > > >> > > > > > > > > Xiongqi (Wesley) Wu
> > > > > > > >> > > > > > > > >
> > > > > > > >> > > > > > > >
> > > > > > > >> > > > > > > >
> > > > > > > >> > > > > > > >
> > > > > > > >> > > > > > > > --
> > > > > > > >> > > > > > > > Xiongqi (Wesley) Wu
> > > > > > > >> > > > > > > >
> > > > > > > >> > > > > > >
> > > > > > > >> > > > > > >
> > > > > > > >> > > > > > > --
> > > > > > > >> > > > > > >
> > > > > > > >> > > > > > > Brett Rann
> > > > > > > >> > > > > > >
> > > > > > > >> > > > > > > Senior DevOps Engineer
> > > > > > > >> > > > > > >
> > > > > > > >> > > > > > >
> > > > > > > >> > > > > > > Zendesk International Ltd
> > > > > > > >> > > > > > >
> > > > > > > >> > > > > > > 395 Collins Street, Melbourne VIC 3000 Australia
> > > > > > > >> > > > > > >
> > > > > > > >> > > > > > > Mobile: +61 (0) 418 826 017
> > > > > > > >> > > > > > >
> > > > > > > >> > > > > >
> > > > > > > >> > > > > >
> > > > > > > >> > > > > >
> > > > > > > >> > > > > > --
> > > > > > > >> > > > > > Xiongqi (Wesley) Wu
> > > > > > > >> > > > > >
> > > > > > > >> > > > >
> > > > > > > >> > > >
> > > > > > > >> > > >
> > > > > > > >> > > >
> > > > > > > >> > > > --
> > > > > > > >> > > > Xiongqi (Wesley) Wu
> > > > > > > >> > > >
> > > > > > > >> > >
> > > > > > > >> >
> > > > > > > >> >
> > > > > > > >> >
> > > > > > > >> > --
> > > > > > > >> > Xiongqi (Wesley) Wu
> > > > > > > >> >
> > > > > > > >>
> > > > > > > >>
> > > > > > > >> --
> > > > > > > >>
> > > > > > > >> Brett Rann
> > > > > > > >>
> > > > > > > >> Senior DevOps Engineer
> > > > > > > >>
> > > > > > > >>
> > > > > > > >> Zendesk International Ltd
> > > > > > > >>
> > > > > > > >> 395 Collins Street, Melbourne VIC 3000 Australia
> > > > > > > >>
> > > > > > > >> Mobile: +61 (0) 418 826 017
> > > > > > > >>
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Xiongqi (Wesley) Wu
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > >
> > > > > > Brett Rann
> > > > > >
> > > > > > Senior DevOps Engineer
> > > > > >
> > > > > >
> > > > > > Zendesk International Ltd
> > > > > >
> > > > > > 395 Collins Street, Melbourne VIC 3000 Australia
> > > > > >
> > > > > > Mobile: +61 (0) 418 826 017
> > > > > >
> > > >
> > >
> > >
> > > --
> > > -Regards,
> > > Mayuresh R. Gharat
> > > (862) 250-7125
> > >
> >
>
