Hi, Bill,

I replied with the following comments earlier to the thread. Did you see
that?

Thanks for the proposal. A couple of comments.

1. It seems that this new policy should work for CreateTime as well. If a
topic is configured with CreateTime, messages may not be added in strict
order in the log. However, to build a time-based index, we will be
maintaining the largest timestamp for all messages in a log segment. We can
delete a segment if its largest timestamp is less than
log.retention.min.timestamp. This guarantees that no messages newer than
log.retention.min.timestamp will be deleted, which is probably what the
user wants.

2. Right now, the user can specify "delete" as the retention policy and a
log segment will be deleted either when the size of a partition exceeds a
threshold or the timestamp of a segment is older than a relative period of
time (say 7 days) from now. What you are proposing is not a new retention
policy, but an additional check that will cause a segment to be deleted
when the timestamp of a segment is older than an absolute timestamp? If so,
could you update the wiki accordingly?

Thanks,

Jun

On Fri, Feb 19, 2016 at 2:57 PM, Bill Warshaw <wdwars...@gmail.com> wrote:

> Hello all,
>
> What is the next step with this proposal?  The work for KIP-32 that it was
> based off merged earlier today (https://github.com/apache/kafka/pull/764,
> thank you Becket).  I have an implementation with tests, and I've confirmed
> that it actually works in a live system.  Is there more discussion that
> needs to be had about this KIP, or should I start a VOTE thread?
>
>
>
> On Tue, Feb 16, 2016 at 5:06 PM, Jun Rao <j...@confluent.io> wrote:
>
> > Bill,
> >
> > Thanks for the proposal. A couple of comments.
> >
> > 1. It seems that this new policy should work for CreateTime as well. If a
> > topic is configured with CreateTime, messages may not be added in strict
> > order in the log. However, to build a time-based index, we will be
> > maintaining the largest timestamp for all messages in a log segment. We
> can
> > delete a segment if its largest timestamp is less than
> > log.retention.min.timestamp. This guarantees that no messages newer than
> > log.retention.min.timestamp will be deleted, which is probably what the
> > user wants.
> >
> > 2. Right now, the user can specify "delete" as the retention policy and a
> > log segment will be deleted either when the size of a partition exceeds a
> > threshold or the timestamp of a segment is older than a relative period
> of
> > time (say 7 days) from now. What you are proposing is not a new retention
> > policy, but an additional check that will cause a segment to be deleted
> > when the timestamp of a segment is older than an absolute timestamp? If
> so,
> > could you update the wiki accordingly?
> >
> > Jun
> >
> >
> >
> > On Sat, Feb 13, 2016 at 3:23 PM, Bill Warshaw <wdwars...@gmail.com>
> wrote:
> >
> > > Hello,
> > >
> > > That is a good catch, thanks for pointing it out.  If this KIP is
> > accepted,
> > > we'd need to document this and make the log cleaner not run
> > timestamp-based
> > > deletion unless message.timestamp.type=LogAppendTime.
> > >
> > > On Sat, Feb 13, 2016 at 5:38 AM, Andrew Schofield <
> > > andrew_schofield_j...@outlook.com> wrote:
> > >
> > > > This KIP is related to KIP-32, but I strikes me that it only makes
> > sense
> > > > with one of the two proposed message timestamp types. If I understand
> > > > correctly, message timestamps are only certain to be monotonically
> > > > increasing in the log if message.timestamp.type=LogAppendTime.
> > > >
> > > >
> > > >
> > > > Does timestamp-based auto-expiration require use of
> > > > message.timestamp.type=LogAppendTime?
> > > >
> > > >
> > > >
> > > >
> > > > I think this KIP is a good idea, but I think it relies on strict
> > ordering
> > > > of timestamps to be workable.
> > > >
> > > >
> > > >
> > > > Andrew Schofield
> > > >
> > > >
> > > >
> > > >
> > > > > Date: Fri, 12 Feb 2016 10:38:46 -0800
> > > > > Subject: Re: [DISCUSS] KIP-47 - Add timestamp-based log deletion
> > policy
> > > > > From: n...@confluent.io
> > > > > To: dev@kafka.apache.org
> > > > >
> > > > > Adding a timestamp based auto-expiration is useful and this
> proposal
> > > > makes
> > > > > sense. Thx!
> > > > >
> > > > > On Wed, Feb 10, 2016 at 3:35 PM, Jay Kreps  wrote:
> > > > >
> > > > >> I think this makes a lot of sense and won't be hard to implement
> and
> > > > >> doesn't create too much in the way of new interfaces.
> > > > >>
> > > > >> -Jay
> > > > >>
> > > > >> On Tue, Feb 9, 2016 at 8:13 AM, Bill Warshaw  wrote:
> > > > >>
> > > > >>> Hello,
> > > > >>>
> > > > >>> I just submitted KIP-47 for adding a new log deletion policy
> based
> > > on a
> > > > >>> minimum timestamp of messages to retain.
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-47+-+Add+timestamp-based+log+deletion+policy
> > > > >>>
> > > > >>> I'm open to any comments or suggestions.
> > > > >>>
> > > > >>> Thanks,
> > > > >>> Bill Warshaw
> > > > >>>
> > > > >>
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Thanks,
> > > > > Neha
> > > >
> > >
> >
>

Reply via email to