Re: [DISCUSS] KIP-47 - Add timestamp-based log deletion policy

2016-10-07 Thread Becket Qin
g.retention.min.timestamp will be deleted, which is
> > probably
> > > > what
> > > > > > the
> > > > > > > > > user wants.
> > > > > > > > >
> > > > > > > > > 2. Right now, the user can specify "delete" as the
> retention
> > > > policy
> > > > > > > and a
> > > > > > > > > log segment will be deleted either when the size of a
> > partition
> > > > > > > exceeds a
> > > > > > > > > threshold or the timestamp of a segment is older than a
> > > relative
> > > > > > period
> > > > > > > > of
> > > > > > > > > time (say 7 days) from now. What you are proposing is not a
> > new
> > > > > > > retention
> > > > > > > > > policy, but an additional check that will cause a segment
> to
> > be
> > > > > > deleted
> > > > > > > > > when the timestamp of a segment is older than an absolute
> > > > > timestamp?
> > > > > > If
> > > > > > > > so,
> > > > > > > > > could you update the wiki accordingly?
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > >
> > > > > > > > > Jun
> > > > > > > > >
> > > > > > > > > On Fri, Feb 19, 2016 at 2:57 PM, Bill Warshaw <
> > > > wdwars...@gmail.com
> > > > > >
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hello all,
> > > > > > > > > >
> > > > > > > > > > What is the next step with this proposal?  The work for
> > > KIP-32
> > > > > that
> > > > > > > it
> > > > > > > > > was
> > > > > > > > > > based off merged earlier today (
> > > > > > > > https://github.com/apache/kafka/pull/764
> > > > > > > > > ,
> > > > > > > > > > thank you Becket).  I have an implementation with tests,
> > and
> > > > I've
> > > > > > > > > confirmed
> > > > > > > > > > that it actually works in a live system.  Is there more
> > > > > discussion
> > > > > > > that
> > > > > > > > > > needs to be had about this KIP, or should I start a VOTE
> > > > thread?
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Tue, Feb 16, 2016 at 5:06 PM, Jun Rao <
> j...@confluent.io
> > >
> > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Bill,
> > > > > > > > > > >
> > > > > > > > > > > Thanks for the proposal. A couple of comments.
> > > > > > > > > > >
> > > > > > > > > > > 1. It seems that this new policy should work for
> > CreateTime
> > > > as
> > > > > > > well.
> > > > > > > > > If a
> > > > > > > > > > > topic is configured with CreateTime, messages may not
> be
> > > > added
> > > > > in
> > > > > > > > > strict
> > > > > > > > > > > order in the log. However, to build a time-based index,
> > we
> > > > will
> > > > > > be
> > > > > > > > > > > maintaining the largest timestamp for all messages in a
> > log
> > > > > > > segment.
> > > > > > > > We
> > > > > > > > > > can
> > > > > > > > > > > delete a segment if its largest timestamp is less than
> > > > > > > > > > > log.retention.min.timestamp. This guarantees that no
> > > messages
> > > > > > newer
> > > > > > > > > than
> > > > > > > > > > > log.retention.min.timestamp will be deleted, which is
> > > >

Re: [DISCUSS] KIP-47 - Add timestamp-based log deletion policy

2016-06-03 Thread Magnus Edenhill
> > > > > > On Fri, Feb 19, 2016 at 2:57 PM, Bill Warshaw <
> > > wdwars...@gmail.com
> > > > >
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hello all,
> > > > > > > > >
> > > > > > > > > What is the next step with this proposal?  The work for
> > KIP-32
> > > > that
> > > > > > it
> > > > > > > > was
> > > > > > > > > based off merged earlier today (
> > > > > > > https://github.com/apache/kafka/pull/764
> > > > > > > > ,
> > > > > > > > > thank you Becket).  I have an implementation with tests,
> and
> > > I've
> > > > > > > > confirmed
> > > > > > > > > that it actually works in a live system.  Is there more
> > > > discussion
> > > > > > that
> > > > > > > > > needs to be had about this KIP, or should I start a VOTE
> > > thread?
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Tue, Feb 16, 2016 at 5:06 PM, Jun Rao <j...@confluent.io
> >
> > > > wrote:
> > > > > > > > >
> > > > > > > > > > Bill,
> > > > > > > > > >
> > > > > > > > > > Thanks for the proposal. A couple of comments.
> > > > > > > > > >
> > > > > > > > > > 1. It seems that this new policy should work for
> CreateTime
> > > as
> > > > > > well.
> > > > > > > > If a
> > > > > > > > > > topic is configured with CreateTime, messages may not be
> > > added
> > > > in
> > > > > > > > strict
> > > > > > > > > > order in the log. However, to build a time-based index,
> we
> > > will
> > > > > be
> > > > > > > > > > maintaining the largest timestamp for all messages in a
> log
> > > > > > segment.
> > > > > > > We
> > > > > > > > > can
> > > > > > > > > > delete a segment if its largest timestamp is less than
> > > > > > > > > > log.retention.min.timestamp. This guarantees that no
> > messages
> > > > > newer
> > > > > > > > than
> > > > > > > > > > log.retention.min.timestamp will be deleted, which is
> > > probably
> > > > > what
> > > > > > > the
> > > > > > > > > > user wants.
> > > > > > > > > >
> > > > > > > > > > 2. Right now, the user can specify "delete" as the
> > retention
> > > > > policy
> > > > > > > > and a
> > > > > > > > > > log segment will be deleted either when the size of a
> > > partition
> > > > > > > > exceeds a
> > > > > > > > > > threshold or the timestamp of a segment is older than a
> > > > relative
> > > > > > > period
> > > > > > > > > of
> > > > > > > > > > time (say 7 days) from now. What you are proposing is
> not a
> > > new
> > > > > > > > retention
> > > > > > > > > > policy, but an additional check that will cause a segment
> > to
> > > be
> > > > > > > deleted
> > > > > > > > > > when the timestamp of a segment is older than an absolute
> > > > > > timestamp?
> > > > > > > If
> > > > > > > > > so,
> > > > > > > > > > could you update the wiki accordingly?
> > > > > > > > > >
> > > > > > > > > > Jun
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Sat, Feb 13, 2016 at 3:23 PM, Bill Warshaw <
> > > > > wdwars...@gmail.com
> > > > > > >
> > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hello,
> > > > > > > > > > >
> > > > > > > > > > > That is a good catch, thanks for pointing it out.  If
> > this
> > > > KIP
> > > > > is
> > > > > > > > > > accepted,
> > > > > > > > > > > we'd need to document this and make the log cleaner not
> > run
> > > > > > > > > > timestamp-based
> > > > > > > > > > > deletion unless message.timestamp.type=LogAppendTime.
> > > > > > > > > > >
> > > > > > > > > > > On Sat, Feb 13, 2016 at 5:38 AM, Andrew Schofield <
> > > > > > > > > > > andrew_schofield_j...@outlook.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > > This KIP is related to KIP-32, but it strikes me that
> it
> > > > only
> > > > > > > makes
> > > > > > > > > > sense
> > > > > > > > > > > > with one of the two proposed message timestamp types.
> > If
> > > I
> > > > > > > > understand
> > > > > > > > > > > > correctly, message timestamps are only certain to be
> > > > > > > monotonically
> > > > > > > > > > > > increasing in the log if
> > > > > message.timestamp.type=LogAppendTime.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Does timestamp-based auto-expiration require use of
> > > > > > > > > > > > message.timestamp.type=LogAppendTime?
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > I think this KIP is a good idea, but I think it
> relies
> > on
> > > > > > strict
> > > > > > > > > > ordering
> > > > > > > > > > > > of timestamps to be workable.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Andrew Schofield
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > > Date: Fri, 12 Feb 2016 10:38:46 -0800
> > > > > > > > > > > > > Subject: Re: [DISCUSS] KIP-47 - Add timestamp-based
> > log
> > > > > > > deletion
> > > > > > > > > > policy
> > > > > > > > > > > > > From: n...@confluent.io
> > > > > > > > > > > > > To: dev@kafka.apache.org
> > > > > > > > > > > > >
> > > > > > > > > > > > > Adding a timestamp based auto-expiration is useful
> > and
> > > > this
> > > > > > > > > proposal
> > > > > > > > > > > > makes
> > > > > > > > > > > > > sense. Thx!
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Wed, Feb 10, 2016 at 3:35 PM, Jay Kreps  wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > >> I think this makes a lot of sense and won't be
> hard
> > to
> > > > > > > implement
> > > > > > > > > and
> > > > > > > > > > > > >> doesn't create too much in the way of new
> > interfaces.
> > > > > > > > > > > > >>
> > > > > > > > > > > > >> -Jay
> > > > > > > > > > > > >>
> > > > > > > > > > > > >> On Tue, Feb 9, 2016 at 8:13 AM, Bill Warshaw
> wrote:
> > > > > > > > > > > > >>
> > > > > > > > > > > > >>> Hello,
> > > > > > > > > > > > >>>
> > > > > > > > > > > > >>> I just submitted KIP-47 for adding a new log
> > deletion
> > > > > > policy
> > > > > > > > > based
> > > > > > > > > > > on a
> > > > > > > > > > > > >>> minimum timestamp of messages to retain.
> > > > > > > > > > > > >>>
> > > > > > > > > > > > >>>
> > > > > > > > > > > > >>>
> > > > > > > > > > > > >>
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-47+-+Add+timestamp-based+log+deletion+policy
> > > > > > > > > > > > >>>
> > > > > > > > > > > > >>> I'm open to any comments or suggestions.
> > > > > > > > > > > > >>>
> > > > > > > > > > > > >>> Thanks,
> > > > > > > > > > > > >>> Bill Warshaw
> > > > > > > > > > > > >>>
> > > > > > > > > > > > >>
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > --
> > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > Neha
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
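
Jun's point 1 above can be illustrated with a small sketch. It is not Kafka code: the `Segment` class and `deletable_segments` function are hypothetical names, and timestamps are plain millisecond integers. It assumes each segment tracks the largest timestamp of its messages, and it checks every segment independently; whether deletion should stop at the first retained segment is not settled in this thread.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """Hypothetical stand-in for a log segment."""
    base_offset: int
    timestamps: List[int]  # message timestamps in ms; may be out of order under CreateTime

    @property
    def largest_timestamp(self) -> int:
        # The value a time-based index would maintain per segment (assumption).
        return max(self.timestamps)


def deletable_segments(segments: List[Segment], min_timestamp_ms: int) -> List[Segment]:
    """Return segments whose largest timestamp is below the cutoff."""
    return [s for s in segments if s.largest_timestamp < min_timestamp_ms]


segments = [
    Segment(0, [100, 90, 120]),    # every message is older than the cutoff
    Segment(3, [110, 300, 130]),   # out-of-order timestamps; one message is newer
    Segment(6, [310, 320]),        # every message is newer than the cutoff
]

doomed = deletable_segments(segments, min_timestamp_ms=200)
print([s.base_offset for s in doomed])  # prints [0]
```

Segment 3 holds messages older than the cutoff (110 and 130) but survives because one of its messages (300) is newer. Under CreateTime the check therefore never deletes a message newer than the cutoff, though some older messages may outlive it.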


Re: [DISCUSS] KIP-47 - Add timestamp-based log deletion policy

2016-02-23 Thread Joel Koshy
 16, 2016 at 5:06 PM, Jun Rao <j...@confluent.io>
> > > wrote:
> > > > > > > >
> > > > > > > > > Bill,
> > > > > > > > >
> > > > > > > > > Thanks for the proposal. A couple of comments.
> > > > > > > > >
> > > > > > > > > 1. It seems that this new policy should work for CreateTime
> > as
> > > > > well.
> > > > > > > If a
> > > > > > > > > topic is configured with CreateTime, messages may not be
> > added
> > > in
> > > > > > > strict
> > > > > > > > > order in the log. However, to build a time-based index, we
> > will
> > > > be
> > > > > > > > > maintaining the largest timestamp for all messages in a log
> > > > > segment.
> > > > > > We
> > > > > > > > can
> > > > > > > > > delete a segment if its largest timestamp is less than
> > > > > > > > > log.retention.min.timestamp. This guarantees that no
> messages
> > > > newer
> > > > > > > than
> > > > > > > > > log.retention.min.timestamp will be deleted, which is
> > probably
> > > > what
> > > > > > the
> > > > > > > > > user wants.
> > > > > > > > >
> > > > > > > > > 2. Right now, the user can specify "delete" as the
> retention
> > > > policy
> > > > > > > and a
> > > > > > > > > log segment will be deleted either when the size of a
> > partition
> > > > > > > exceeds a
> > > > > > > > > threshold or the timestamp of a segment is older than a
> > > relative
> > > > > > period
> > > > > > > > of
> > > > > > > > > time (say 7 days) from now. What you are proposing is not a
> > new
> > > > > > > retention
> > > > > > > > > policy, but an additional check that will cause a segment
> to
> > be
> > > > > > deleted
> > > > > > > > > when the timestamp of a segment is older than an absolute
> > > > > timestamp?
> > > > > > If
> > > > > > > > so,
> > > > > > > > > could you update the wiki accordingly?
> > > > > > > > >
> > > > > > > > > Jun
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Sat, Feb 13, 2016 at 3:23 PM, Bill Warshaw <
> > > > wdwars...@gmail.com
> > > > > >
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hello,
> > > > > > > > > >
> > > > > > > > > > That is a good catch, thanks for pointing it out.  If
> this
> > > KIP
> > > > is
> > > > > > > > > accepted,
> > > > > > > > > > we'd need to document this and make the log cleaner not
> run
> > > > > > > > > timestamp-based
> > > > > > > > > > deletion unless message.timestamp.type=LogAppendTime.
> > > > > > > > > >
> > > > > > > > > > On Sat, Feb 13, 2016 at 5:38 AM, Andrew Schofield <
> > > > > > > > > > andrew_schofield_j...@outlook.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > > This KIP is related to KIP-32, but it strikes me that it
> > > only
> > > > > > makes
> > > > > > > > > sense
> > > > > > > > > > > with one of the two proposed message timestamp types.
> If
> > I
> > > > > > > understand
> > > > > > > > > > > correctly, message timestamps are only certain to be
> > > > > > monotonically
> > > > > > > > > > > increasing in the log if
> > > > message.timestamp.type=LogAppendTime.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Does timestamp-based auto-expiration require use of
> > > > > > > > > > > message.timestamp.type=LogAppendTime?
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > I think this KIP is a good idea, but I think it relies
> on
> > > > > strict
> > > > > > > > > ordering
> > > > > > > > > > > of timestamps to be workable.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Andrew Schofield
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > Date: Fri, 12 Feb 2016 10:38:46 -0800
> > > > > > > > > > > > Subject: Re: [DISCUSS] KIP-47 - Add timestamp-based
> log
> > > > > > deletion
> > > > > > > > > policy
> > > > > > > > > > > > From: n...@confluent.io
> > > > > > > > > > > > To: dev@kafka.apache.org
> > > > > > > > > > > >
> > > > > > > > > > > > Adding a timestamp based auto-expiration is useful
> and
> > > this
> > > > > > > > proposal
> > > > > > > > > > > makes
> > > > > > > > > > > > sense. Thx!
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Feb 10, 2016 at 3:35 PM, Jay Kreps  wrote:
> > > > > > > > > > > >
> > > > > > > > > > > >> I think this makes a lot of sense and won't be hard
> to
> > > > > > implement
> > > > > > > > and
> > > > > > > > > > > >> doesn't create too much in the way of new
> interfaces.
> > > > > > > > > > > >>
> > > > > > > > > > > >> -Jay
> > > > > > > > > > > >>
> > > > > > > > > > > >> On Tue, Feb 9, 2016 at 8:13 AM, Bill Warshaw  wrote:
> > > > > > > > > > > >>
> > > > > > > > > > > >>> Hello,
> > > > > > > > > > > >>>
> > > > > > > > > > > >>> I just submitted KIP-47 for adding a new log
> deletion
> > > > > policy
> > > > > > > > based
> > > > > > > > > > on a
> > > > > > > > > > > >>> minimum timestamp of messages to retain.
> > > > > > > > > > > >>>
> > > > > > > > > > > >>>
> > > > > > > > > > > >>>
> > > > > > > > > > > >>
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-47+-+Add+timestamp-based+log+deletion+policy
> > > > > > > > > > > >>>
> > > > > > > > > > > >>> I'm open to any comments or suggestions.
> > > > > > > > > > > >>>
> > > > > > > > > > > >>> Thanks,
> > > > > > > > > > > >>> Bill Warshaw
> > > > > > > > > > > >>>
> > > > > > > > > > > >>
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > --
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > > Neha
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
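
Jun's point 2 describes the absolute-timestamp check as one more condition alongside the existing size and relative-age checks. A minimal sketch under that reading follows. The function name and the tuple layout are hypothetical, the size rule is simplified (drop the oldest segments while the partition is over the threshold), and none of this is the broker's actual code.

```python
from typing import List, Optional, Tuple


def segments_to_delete(
    segments: List[Tuple[int, int]],          # (size_bytes, largest_timestamp_ms), oldest first
    now_ms: int,
    retention_bytes: Optional[int] = None,    # size threshold for the partition
    retention_ms: Optional[int] = None,       # relative age threshold
    min_timestamp_ms: Optional[int] = None,   # absolute cutoff proposed in KIP-47
) -> List[int]:
    """Return indexes of segments that any enabled check would delete."""
    doomed = set()
    total = sum(size for size, _ in segments)
    for i, (size, largest_ts) in enumerate(segments):
        # Size check (simplified): shed oldest segments while over the threshold.
        if retention_bytes is not None and total > retention_bytes:
            doomed.add(i)
            total -= size
        # Relative check: segment is older than a period measured from now.
        if retention_ms is not None and now_ms - largest_ts > retention_ms:
            doomed.add(i)
        # Absolute check: segment is older than a fixed timestamp.
        if min_timestamp_ms is not None and largest_ts < min_timestamp_ms:
            doomed.add(i)
    return sorted(doomed)


segs = [(100, 1000), (100, 5000), (100, 9000)]
print(segments_to_delete(segs, now_ms=10000, retention_ms=6000))      # prints [0]
print(segments_to_delete(segs, now_ms=10000, min_timestamp_ms=6000))  # prints [0, 1]
```

The relative check moves with the clock, so the set of deletable segments grows over time even if the configuration never changes. The absolute check only changes when someone updates the cutoff.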


Re: [DISCUSS] KIP-47 - Add timestamp-based log deletion policy

2016-02-23 Thread Bill Warshaw
ewer
> > > > > > than
> > > > > > > > log.retention.min.timestamp will be deleted, which is
> probably
> > > what
> > > > > the
> > > > > > > > user wants.
> > > > > > > >
> > > > > > > > 2. Right now, the user can specify "delete" as the retention
> > > policy
> > > > > > and a
> > > > > > > > log segment will be deleted either when the size of a
> partition
> > > > > > exceeds a
> > > > > > > > threshold or the timestamp of a segment is older than a
> > relative
> > > > > period
> > > > > > > of
> > > > > > > > time (say 7 days) from now. What you are proposing is not a
> new
> > > > > > retention
> > > > > > > > policy, but an additional check that will cause a segment to
> be
> > > > > deleted
> > > > > > > > when the timestamp of a segment is older than an absolute
> > > > timestamp?
> > > > > If
> > > > > > > so,
> > > > > > > > could you update the wiki accordingly?
> > > > > > > >
> > > > > > > > Jun
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Sat, Feb 13, 2016 at 3:23 PM, Bill Warshaw <
> > > wdwars...@gmail.com
> > > > >
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hello,
> > > > > > > > >
> > > > > > > > > That is a good catch, thanks for pointing it out.  If this
> > KIP
> > > is
> > > > > > > > accepted,
> > > > > > > > > we'd need to document this and make the log cleaner not run
> > > > > > > > timestamp-based
> > > > > > > > > deletion unless message.timestamp.type=LogAppendTime.
> > > > > > > > >
> > > > > > > > > On Sat, Feb 13, 2016 at 5:38 AM, Andrew Schofield <
> > > > > > > > > andrew_schofield_j...@outlook.com> wrote:
> > > > > > > > >
> > > > > > > > > > > This KIP is related to KIP-32, but it strikes me that it
> > only
> > > > > makes
> > > > > > > > sense
> > > > > > > > > > with one of the two proposed message timestamp types. If
> I
> > > > > > understand
> > > > > > > > > > correctly, message timestamps are only certain to be
> > > > > monotonically
> > > > > > > > > > increasing in the log if
> > > message.timestamp.type=LogAppendTime.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Does timestamp-based auto-expiration require use of
> > > > > > > > > > message.timestamp.type=LogAppendTime?
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > I think this KIP is a good idea, but I think it relies on
> > > > strict
> > > > > > > > ordering
> > > > > > > > > > of timestamps to be workable.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Andrew Schofield
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > Date: Fri, 12 Feb 2016 10:38:46 -0800
> > > > > > > > > > > Subject: Re: [DISCUSS] KIP-47 - Add timestamp-based log
> > > > > deletion
> > > > > > > > policy
> > > > > > > > > > > From: n...@confluent.io
> > > > > > > > > > > To: dev@kafka.apache.org
> > > > > > > > > > >
> > > > > > > > > > > Adding a timestamp based auto-expiration is useful and
> > this
> > > > > > > proposal
> > > > > > > > > > makes
> > > > > > > > > > > sense. Thx!
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Feb 10, 2016 at 3:35 PM, Jay Kreps  wrote:
> > > > > > > > > > >
> > > > > > > > > > >> I think this makes a lot of sense and won't be hard to
> > > > > implement
> > > > > > > and
> > > > > > > > > > >> doesn't create too much in the way of new interfaces.
> > > > > > > > > > >>
> > > > > > > > > > >> -Jay
> > > > > > > > > > >>
> > > > > > > > > > >> On Tue, Feb 9, 2016 at 8:13 AM, Bill Warshaw  wrote:
> > > > > > > > > > >>
> > > > > > > > > > >>> Hello,
> > > > > > > > > > >>>
> > > > > > > > > > >>> I just submitted KIP-47 for adding a new log deletion
> > > > policy
> > > > > > > based
> > > > > > > > > on a
> > > > > > > > > > >>> minimum timestamp of messages to retain.
> > > > > > > > > > >>>
> > > > > > > > > > >>>
> > > > > > > > > > >>>
> > > > > > > > > > >>
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-47+-+Add+timestamp-based+log+deletion+policy
> > > > > > > > > > >>>
> > > > > > > > > > >>> I'm open to any comments or suggestions.
> > > > > > > > > > >>>
> > > > > > > > > > >>> Thanks,
> > > > > > > > > > >>> Bill Warshaw
> > > > > > > > > > >>>
> > > > > > > > > > >>
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > Thanks,
> > > > > > > > > > > Neha
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: [DISCUSS] KIP-47 - Add timestamp-based log deletion policy

2016-02-23 Thread Joel Koshy
ted
> > > > > when the timestamp of a segment is older than an absolute
> timestamp?
> > If
> > > > so,
> > > > > could you update the wiki accordingly?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jun
> > > > >
> > > > > On Fri, Feb 19, 2016 at 2:57 PM, Bill Warshaw <wdwars...@gmail.com
> >
> > > > wrote:
> > > > >
> > > > > > Hello all,
> > > > > >
> > > > > > What is the next step with this proposal?  The work for KIP-32
> that
> > > it
> > > > > was
> > > > > > based off merged earlier today (
> > > > https://github.com/apache/kafka/pull/764
> > > > > ,
> > > > > > thank you Becket).  I have an implementation with tests, and I've
> > > > > confirmed
> > > > > > that it actually works in a live system.  Is there more
> discussion
> > > that
> > > > > > needs to be had about this KIP, or should I start a VOTE thread?
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Tue, Feb 16, 2016 at 5:06 PM, Jun Rao <j...@confluent.io>
> wrote:
> > > > > >
> > > > > > > Bill,
> > > > > > >
> > > > > > > Thanks for the proposal. A couple of comments.
> > > > > > >
> > > > > > > 1. It seems that this new policy should work for CreateTime as
> > > well.
> > > > > If a
> > > > > > > topic is configured with CreateTime, messages may not be added
> in
> > > > > strict
> > > > > > > order in the log. However, to build a time-based index, we will
> > be
> > > > > > > maintaining the largest timestamp for all messages in a log
> > > segment.
> > > > We
> > > > > > can
> > > > > > > delete a segment if its largest timestamp is less than
> > > > > > > log.retention.min.timestamp. This guarantees that no messages
> > newer
> > > > > than
> > > > > > > log.retention.min.timestamp will be deleted, which is probably
> > what
> > > > the
> > > > > > > user wants.
> > > > > > >
> > > > > > > 2. Right now, the user can specify "delete" as the retention
> > policy
> > > > > and a
> > > > > > > log segment will be deleted either when the size of a partition
> > > > > exceeds a
> > > > > > > threshold or the timestamp of a segment is older than a
> relative
> > > > period
> > > > > > of
> > > > > > > time (say 7 days) from now. What you are proposing is not a new
> > > > > retention
> > > > > > > policy, but an additional check that will cause a segment to be
> > > > deleted
> > > > > > > when the timestamp of a segment is older than an absolute
> > > timestamp?
> > > > If
> > > > > > so,
> > > > > > > could you update the wiki accordingly?
> > > > > > >
> > > > > > > Jun
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Sat, Feb 13, 2016 at 3:23 PM, Bill Warshaw <
> > wdwars...@gmail.com
> > > >
> > > > > > wrote:
> > > > > > >
> > > > > > > > Hello,
> > > > > > > >
> > > > > > > > That is a good catch, thanks for pointing it out.  If this
> KIP
> > is
> > > > > > > accepted,
> > > > > > > > we'd need to document this and make the log cleaner not run
> > > > > > > timestamp-based
> > > > > > > > deletion unless message.timestamp.type=LogAppendTime.
> > > > > > > >
> > > > > > > > On Sat, Feb 13, 2016 at 5:38 AM, Andrew Schofield <
> > > > > > > > andrew_schofield_j...@outlook.com> wrote:
> > > > > > > >
> > > > > > > > > > This KIP is related to KIP-32, but it strikes me that it
> only
> > > > makes
> > > > > > > sense
> > > > > > > > > with one of the two proposed message timestamp types. If I
> > > > > understand
> > > > > > > > > correctly, message timestamps are only certain to be
> > > > monotonically
> > > > > > > > > increasing in the log if
> > message.timestamp.type=LogAppendTime.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Does timestamp-based auto-expiration require use of
> > > > > > > > > message.timestamp.type=LogAppendTime?
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > I think this KIP is a good idea, but I think it relies on
> > > strict
> > > > > > > ordering
> > > > > > > > > of timestamps to be workable.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Andrew Schofield
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > Date: Fri, 12 Feb 2016 10:38:46 -0800
> > > > > > > > > > Subject: Re: [DISCUSS] KIP-47 - Add timestamp-based log
> > > > deletion
> > > > > > > policy
> > > > > > > > > > From: n...@confluent.io
> > > > > > > > > > To: dev@kafka.apache.org
> > > > > > > > > >
> > > > > > > > > > Adding a timestamp based auto-expiration is useful and
> this
> > > > > > proposal
> > > > > > > > > makes
> > > > > > > > > > sense. Thx!
> > > > > > > > > >
> > > > > > > > > > On Wed, Feb 10, 2016 at 3:35 PM, Jay Kreps  wrote:
> > > > > > > > > >
> > > > > > > > > >> I think this makes a lot of sense and won't be hard to
> > > > implement
> > > > > > and
> > > > > > > > > >> doesn't create too much in the way of new interfaces.
> > > > > > > > > >>
> > > > > > > > > >> -Jay
> > > > > > > > > >>
> > > > > > > > > >> On Tue, Feb 9, 2016 at 8:13 AM, Bill Warshaw  wrote:
> > > > > > > > > >>
> > > > > > > > > >>> Hello,
> > > > > > > > > >>>
> > > > > > > > > >>> I just submitted KIP-47 for adding a new log deletion
> > > policy
> > > > > > based
> > > > > > > > on a
> > > > > > > > > >>> minimum timestamp of messages to retain.
> > > > > > > > > >>>
> > > > > > > > > >>>
> > > > > > > > > >>>
> > > > > > > > > >>
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-47+-+Add+timestamp-based+log+deletion+policy
> > > > > > > > > >>>
> > > > > > > > > >>> I'm open to any comments or suggestions.
> > > > > > > > > >>>
> > > > > > > > > >>> Thanks,
> > > > > > > > > >>> Bill Warshaw
> > > > > > > > > >>>
> > > > > > > > > >>
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > Thanks,
> > > > > > > > > > Neha
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: [DISCUSS] KIP-47 - Add timestamp-based log deletion policy

2016-02-22 Thread Bill Warshaw
 >
> > > > > > 1. It seems that this new policy should work for CreateTime as
> > well.
> > > > If a
> > > > > > topic is configured with CreateTime, messages may not be added in
> > > > strict
> > > > > > order in the log. However, to build a time-based index, we will
> be
> > > > > > maintaining the largest timestamp for all messages in a log
> > segment.
> > > We
> > > > > can
> > > > > > delete a segment if its largest timestamp is less than
> > > > > > log.retention.min.timestamp. This guarantees that no messages
> newer
> > > > than
> > > > > > log.retention.min.timestamp will be deleted, which is probably
> what
> > > the
> > > > > > user wants.
> > > > > >
> > > > > > 2. Right now, the user can specify "delete" as the retention
> policy
> > > > and a
> > > > > > log segment will be deleted either when the size of a partition
> > > > exceeds a
> > > > > > threshold or the timestamp of a segment is older than a relative
> > > period
> > > > > of
> > > > > > time (say 7 days) from now. What you are proposing is not a new
> > > > retention
> > > > > > policy, but an additional check that will cause a segment to be
> > > deleted
> > > > > > when the timestamp of a segment is older than an absolute
> > timestamp?
> > > If
> > > > > so,
> > > > > > could you update the wiki accordingly?
> > > > > >
> > > > > > Jun
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Sat, Feb 13, 2016 at 3:23 PM, Bill Warshaw <
> wdwars...@gmail.com
> > >
> > > > > wrote:
> > > > > >
> > > > > > > Hello,
> > > > > > >
> > > > > > > That is a good catch, thanks for pointing it out.  If this KIP
> is
> > > > > > accepted,
> > > > > > > we'd need to document this and make the log cleaner not run
> > > > > > timestamp-based
> > > > > > > deletion unless message.timestamp.type=LogAppendTime.
> > > > > > >
> > > > > > > On Sat, Feb 13, 2016 at 5:38 AM, Andrew Schofield <
> > > > > > > andrew_schofield_j...@outlook.com> wrote:
> > > > > > >
> > > > > > > > > This KIP is related to KIP-32, but it strikes me that it only
> > > makes
> > > > > > sense
> > > > > > > > with one of the two proposed message timestamp types. If I
> > > > understand
> > > > > > > > correctly, message timestamps are only certain to be
> > > monotonically
> > > > > > > > increasing in the log if
> message.timestamp.type=LogAppendTime.
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > Does timestamp-based auto-expiration require use of
> > > > > > > > message.timestamp.type=LogAppendTime?
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > I think this KIP is a good idea, but I think it relies on
> > strict
> > > > > > ordering
> > > > > > > > of timestamps to be workable.
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > Andrew Schofield
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > > Date: Fri, 12 Feb 2016 10:38:46 -0800
> > > > > > > > > Subject: Re: [DISCUSS] KIP-47 - Add timestamp-based log
> > > deletion
> > > > > > policy
> > > > > > > > > From: n...@confluent.io
> > > > > > > > > To: dev@kafka.apache.org
> > > > > > > > >
> > > > > > > > > Adding a timestamp based auto-expiration is useful and this
> > > > > proposal
> > > > > > > > makes
> > > > > > > > > sense. Thx!
> > > > > > > > >
> > > > > > > > > On Wed, Feb 10, 2016 at 3:35 PM, Jay Kreps  wrote:
> > > > > > > > >
> > > > > > > > >> I think this makes a lot of sense and won't be hard to
> > > implement
> > > > > and
> > > > > > > > >> doesn't create too much in the way of new interfaces.
> > > > > > > > >>
> > > > > > > > >> -Jay
> > > > > > > > >>
> > > > > > > > >> On Tue, Feb 9, 2016 at 8:13 AM, Bill Warshaw  wrote:
> > > > > > > > >>
> > > > > > > > >>> Hello,
> > > > > > > > >>>
> > > > > > > > >>> I just submitted KIP-47 for adding a new log deletion
> > policy
> > > > > based
> > > > > > > on a
> > > > > > > > >>> minimum timestamp of messages to retain.
> > > > > > > > >>>
> > > > > > > > >>>
> > > > > > > > >>>
> > > > > > > > >>
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-47+-+Add+timestamp-based+log+deletion+policy
> > > > > > > > >>>
> > > > > > > > >>> I'm open to any comments or suggestions.
> > > > > > > > >>>
> > > > > > > > >>> Thanks,
> > > > > > > > >>> Bill Warshaw
> > > > > > > > >>>
> > > > > > > > >>
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Thanks,
> > > > > > > > > Neha
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: [DISCUSS] KIP-47 - Add timestamp-based log deletion policy

2016-02-22 Thread Becket Qin
> > > > >
> > > > > 2. Right now, the user can specify "delete" as the retention policy
> > > and a
> > > > > log segment will be deleted either when the size of a partition
> > > exceeds a
> > > > > threshold or the timestamp of a segment is older than a relative
> > period
> > > > of
> > > > > time (say 7 days) from now. What you are proposing is not a new
> > > retention
> > > > > policy, but an additional check that will cause a segment to be
> > deleted
> > > > > when the timestamp of a segment is older than an absolute
> timestamp?
> > If
> > > > so,
> > > > > could you update the wiki accordingly?
> > > > >
> > > > > Jun
> > > > >
> > > > >
> > > > >
> > > > > On Sat, Feb 13, 2016 at 3:23 PM, Bill Warshaw <wdwars...@gmail.com
> >
> > > > wrote:
> > > > >
> > > > > > Hello,
> > > > > >
> > > > > > That is a good catch, thanks for pointing it out.  If this KIP is
> > > > > accepted,
> > > > > > we'd need to document this and make the log cleaner not run
> > > > > timestamp-based
> > > > > > deletion unless message.timestamp.type=LogAppendTime.
> > > > > >
> > > > > > On Sat, Feb 13, 2016 at 5:38 AM, Andrew Schofield <
> > > > > > andrew_schofield_j...@outlook.com> wrote:
> > > > > >
> > > > > > > > This KIP is related to KIP-32, but it strikes me that it only
> > makes
> > > > > sense
> > > > > > > with one of the two proposed message timestamp types. If I
> > > understand
> > > > > > > correctly, message timestamps are only certain to be
> > monotonically
> > > > > > > increasing in the log if message.timestamp.type=LogAppendTime.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Does timestamp-based auto-expiration require use of
> > > > > > > message.timestamp.type=LogAppendTime?
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > I think this KIP is a good idea, but I think it relies on
> strict
> > > > > ordering
> > > > > > > of timestamps to be workable.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Andrew Schofield
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > Date: Fri, 12 Feb 2016 10:38:46 -0800
> > > > > > > > Subject: Re: [DISCUSS] KIP-47 - Add timestamp-based log
> > deletion
> > > > > policy
> > > > > > > > From: n...@confluent.io
> > > > > > > > To: dev@kafka.apache.org
> > > > > > > >
> > > > > > > > Adding a timestamp based auto-expiration is useful and this
> > > > proposal
> > > > > > > makes
> > > > > > > > sense. Thx!
> > > > > > > >
> > > > > > > > On Wed, Feb 10, 2016 at 3:35 PM, Jay Kreps  wrote:
> > > > > > > >
> > > > > > > >> I think this makes a lot of sense and won't be hard to
> > implement
> > > > and
> > > > > > > >> doesn't create too much in the way of new interfaces.
> > > > > > > >>
> > > > > > > >> -Jay
> > > > > > > >>
> > > > > > > >> On Tue, Feb 9, 2016 at 8:13 AM, Bill Warshaw  wrote:
> > > > > > > >>
> > > > > > > >>> Hello,
> > > > > > > >>>
> > > > > > > > >>> I just submitted KIP-47 for adding a new log deletion policy based on a
> > > > > > > > >>> minimum timestamp of messages to retain.
> > > > > > > > >>>
> > > > > > > > >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-47+-+Add+timestamp-based+log+deletion+policy
> > > > > > > >>>
> > > > > > > >>> I'm open to any comments or suggestions.
> > > > > > > >>>
> > > > > > > >>> Thanks,
> > > > > > > >>> Bill Warshaw
> > > > > > > >>>
> > > > > > > >>
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Thanks,
> > > > > > > > Neha
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: [DISCUSS] KIP-47 - Add timestamp-based log deletion policy

2016-02-22 Thread Jun Rao
> > > > > On Sat, Feb 13, 2016 at 3:23 PM, Bill Warshaw <wdwars...@gmail.com>
> > > wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > That is a good catch, thanks for pointing it out.  If this KIP is
> > > > accepted,
> > > > > we'd need to document this and make the log cleaner not run
> > > > timestamp-based
> > > > > deletion unless message.timestamp.type=LogAppendTime.
> > > > >
> > > > > On Sat, Feb 13, 2016 at 5:38 AM, Andrew Schofield <
> > > > > andrew_schofield_j...@outlook.com> wrote:
> > > > >
> > > > > > > This KIP is related to KIP-32, but it strikes me that it only
> makes
> > > > sense
> > > > > > with one of the two proposed message timestamp types. If I
> > understand
> > > > > > correctly, message timestamps are only certain to be
> monotonically
> > > > > > increasing in the log if message.timestamp.type=LogAppendTime.
> > > > > >
> > > > > >
> > > > > >
> > > > > > Does timestamp-based auto-expiration require use of
> > > > > > message.timestamp.type=LogAppendTime?
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > I think this KIP is a good idea, but I think it relies on strict
> > > > ordering
> > > > > > of timestamps to be workable.
> > > > > >
> > > > > >
> > > > > >
> > > > > > Andrew Schofield
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > > Date: Fri, 12 Feb 2016 10:38:46 -0800
> > > > > > > Subject: Re: [DISCUSS] KIP-47 - Add timestamp-based log
> deletion
> > > > policy
> > > > > > > From: n...@confluent.io
> > > > > > > To: dev@kafka.apache.org
> > > > > > >
> > > > > > > Adding a timestamp based auto-expiration is useful and this
> > > proposal
> > > > > > makes
> > > > > > > sense. Thx!
> > > > > > >
> > > > > > > On Wed, Feb 10, 2016 at 3:35 PM, Jay Kreps  wrote:
> > > > > > >
> > > > > > >> I think this makes a lot of sense and won't be hard to
> implement
> > > and
> > > > > > >> doesn't create too much in the way of new interfaces.
> > > > > > >>
> > > > > > >> -Jay
> > > > > > >>
> > > > > > >> On Tue, Feb 9, 2016 at 8:13 AM, Bill Warshaw  wrote:
> > > > > > >>
> > > > > > >>> Hello,
> > > > > > >>>
> > > > > > > >>> I just submitted KIP-47 for adding a new log deletion policy based on a
> > > > > > > >>> minimum timestamp of messages to retain.
> > > > > > > >>>
> > > > > > > >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-47+-+Add+timestamp-based+log+deletion+policy
> > > > > > >>>
> > > > > > >>> I'm open to any comments or suggestions.
> > > > > > >>>
> > > > > > >>> Thanks,
> > > > > > >>> Bill Warshaw
> > > > > > >>>
> > > > > > >>
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Thanks,
> > > > > > > Neha
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: [DISCUSS] KIP-47 - Add timestamp-based log deletion policy

2016-02-19 Thread Bill Warshaw
Hi Jun,

1.  I thought more about Andrew's comment about LogAppendTime.  The
time-based index you are referring to is associated with KIP-33, correct?
Currently my implementation is just checking the last message in a segment,
so we're restricted to LogAppendTime.  When the work for KIP-33 is
completed, it sounds like CreateTime would also be valid.  Do you happen to
know if anyone is currently working on KIP-33?

2. I did update the wiki after reading your original comment, but reading
over it again I realize I could word a couple things more clearly.  I will
do that tonight.
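
To illustrate point 1, here is a minimal Java sketch of why a check on the last message in a segment is only safe with LogAppendTime. It is not code from the KIP-47 patch: the class and method names are hypothetical, and a segment is modelled as an array of timestamps in append order.

```java
// Hypothetical sketch; not code from the KIP-47 implementation.
final class LastMessageCheckSketch {

    // The check described above: look only at the last message in the segment.
    static boolean deletableByLastMessage(long[] timestamps, long minTimestamp) {
        return timestamps.length > 0
                && timestamps[timestamps.length - 1] < minTimestamp;
    }

    // Counts messages at or above minTimestamp, i.e. messages that deleting
    // the whole segment would drop even though they should be retained.
    static int messagesWronglyDropped(long[] timestamps, long minTimestamp) {
        int count = 0;
        for (long t : timestamps) {
            if (t >= minTimestamp) {
                count++;
            }
        }
        return count;
    }
}
```

With non-decreasing timestamps such as {100, 150, 180} and a minimum of 200, the last-message check approves deletion and nothing is wrongly dropped. With out-of-order CreateTime values such as {100, 500, 150}, the same check still approves deletion, but the message stamped 500 would be lost.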

Bill

On Fri, Feb 19, 2016 at 7:02 PM, Jun Rao <j...@confluent.io> wrote:

> Hi, Bill,
>
> I replied with the following comments earlier to the thread. Did you see
> that?
>
> Thanks for the proposal. A couple of comments.
>
> 1. It seems that this new policy should work for CreateTime as well. If a
> topic is configured with CreateTime, messages may not be added in strict
> order in the log. However, to build a time-based index, we will be
> maintaining the largest timestamp for all messages in a log segment. We can
> delete a segment if its largest timestamp is less than
> log.retention.min.timestamp. This guarantees that no messages newer than
> log.retention.min.timestamp will be deleted, which is probably what the
> user wants.
>
> 2. Right now, the user can specify "delete" as the retention policy and a
> log segment will be deleted either when the size of a partition exceeds a
> threshold or the timestamp of a segment is older than a relative period of
> time (say 7 days) from now. What you are proposing is not a new retention
> policy, but an additional check that will cause a segment to be deleted
> when the timestamp of a segment is older than an absolute timestamp? If so,
> could you update the wiki accordingly?
>
> Thanks,
>
> Jun
>
> On Fri, Feb 19, 2016 at 2:57 PM, Bill Warshaw <wdwars...@gmail.com> wrote:
>
> > Hello all,
> >
> > What is the next step with this proposal?  The work for KIP-32 that it
> was
> > based off merged earlier today (https://github.com/apache/kafka/pull/764
> ,
> > thank you Becket).  I have an implementation with tests, and I've
> confirmed
> > that it actually works in a live system.  Is there more discussion that
> > needs to be had about this KIP, or should I start a VOTE thread?
> >
> >
> >
> > On Tue, Feb 16, 2016 at 5:06 PM, Jun Rao <j...@confluent.io> wrote:
> >
> > > Bill,
> > >
> > > Thanks for the proposal. A couple of comments.
> > >
> > > 1. It seems that this new policy should work for CreateTime as well.
> If a
> > > topic is configured with CreateTime, messages may not be added in
> strict
> > > order in the log. However, to build a time-based index, we will be
> > > maintaining the largest timestamp for all messages in a log segment. We
> > can
> > > delete a segment if its largest timestamp is less than
> > > log.retention.min.timestamp. This guarantees that no messages newer
> than
> > > log.retention.min.timestamp will be deleted, which is probably what the
> > > user wants.
> > >
> > > 2. Right now, the user can specify "delete" as the retention policy
> and a
> > > log segment will be deleted either when the size of a partition
> exceeds a
> > > threshold or the timestamp of a segment is older than a relative period
> > of
> > > time (say 7 days) from now. What you are proposing is not a new
> retention
> > > policy, but an additional check that will cause a segment to be deleted
> > > when the timestamp of a segment is older than an absolute timestamp? If
> > so,
> > > could you update the wiki accordingly?
> > >
> > > Jun
> > >
> > >
> > >
> > > On Sat, Feb 13, 2016 at 3:23 PM, Bill Warshaw <wdwars...@gmail.com>
> > wrote:
> > >
> > > > Hello,
> > > >
> > > > That is a good catch, thanks for pointing it out.  If this KIP is
> > > accepted,
> > > > we'd need to document this and make the log cleaner not run
> > > timestamp-based
> > > > deletion unless message.timestamp.type=LogAppendTime.
> > > >
> > > > On Sat, Feb 13, 2016 at 5:38 AM, Andrew Schofield <
> > > > andrew_schofield_j...@outlook.com> wrote:
> > > >
> > > > > > This KIP is related to KIP-32, but it strikes me that it only makes
> > > sense
> > > > > with one of the two proposed message timestamp types. If I
> understand
> > > > > >

Re: [DISCUSS] KIP-47 - Add timestamp-based log deletion policy

2016-02-19 Thread Bill Warshaw
 now. What you are proposing is not a new
> > retention
> > > > policy, but an additional check that will cause a segment to be
> deleted
> > > > when the timestamp of a segment is older than an absolute timestamp?
> If
> > > so,
> > > > could you update the wiki accordingly?
> > > >
> > > > Jun
> > > >
> > > >
> > > >
> > > > On Sat, Feb 13, 2016 at 3:23 PM, Bill Warshaw <wdwars...@gmail.com>
> > > wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > That is a good catch, thanks for pointing it out.  If this KIP is
> > > > accepted,
> > > > > we'd need to document this and make the log cleaner not run
> > > > timestamp-based
> > > > > deletion unless message.timestamp.type=LogAppendTime.
> > > > >
> > > > > On Sat, Feb 13, 2016 at 5:38 AM, Andrew Schofield <
> > > > > andrew_schofield_j...@outlook.com> wrote:
> > > > >
> > > > > > > This KIP is related to KIP-32, but it strikes me that it only
> makes
> > > > sense
> > > > > > with one of the two proposed message timestamp types. If I
> > understand
> > > > > > correctly, message timestamps are only certain to be
> monotonically
> > > > > > increasing in the log if message.timestamp.type=LogAppendTime.
> > > > > >
> > > > > >
> > > > > >
> > > > > > Does timestamp-based auto-expiration require use of
> > > > > > message.timestamp.type=LogAppendTime?
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > I think this KIP is a good idea, but I think it relies on strict
> > > > ordering
> > > > > > of timestamps to be workable.
> > > > > >
> > > > > >
> > > > > >
> > > > > > Andrew Schofield
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > > Date: Fri, 12 Feb 2016 10:38:46 -0800
> > > > > > > Subject: Re: [DISCUSS] KIP-47 - Add timestamp-based log
> deletion
> > > > policy
> > > > > > > From: n...@confluent.io
> > > > > > > To: dev@kafka.apache.org
> > > > > > >
> > > > > > > Adding a timestamp based auto-expiration is useful and this
> > > proposal
> > > > > > makes
> > > > > > > sense. Thx!
> > > > > > >
> > > > > > > On Wed, Feb 10, 2016 at 3:35 PM, Jay Kreps  wrote:
> > > > > > >
> > > > > > >> I think this makes a lot of sense and won't be hard to
> implement
> > > and
> > > > > > >> doesn't create too much in the way of new interfaces.
> > > > > > >>
> > > > > > >> -Jay
> > > > > > >>
> > > > > > >> On Tue, Feb 9, 2016 at 8:13 AM, Bill Warshaw  wrote:
> > > > > > >>
> > > > > > >>> Hello,
> > > > > > >>>
> > > > > > > >>> I just submitted KIP-47 for adding a new log deletion policy based on a
> > > > > > > >>> minimum timestamp of messages to retain.
> > > > > > > >>>
> > > > > > > >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-47+-+Add+timestamp-based+log+deletion+policy
> > > > > > >>>
> > > > > > >>> I'm open to any comments or suggestions.
> > > > > > >>>
> > > > > > >>> Thanks,
> > > > > > >>> Bill Warshaw
> > > > > > >>>
> > > > > > >>
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Thanks,
> > > > > > > Neha
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: [DISCUSS] KIP-47 - Add timestamp-based log deletion policy

2016-02-19 Thread Becket Qin
> > > > > >
> > > > > > This KIP is related to KIP-32, but it strikes me that it only makes
> > > sense
> > > > > with one of the two proposed message timestamp types. If I
> understand
> > > > > correctly, message timestamps are only certain to be monotonically
> > > > > increasing in the log if message.timestamp.type=LogAppendTime.
> > > > >
> > > > >
> > > > >
> > > > > Does timestamp-based auto-expiration require use of
> > > > > message.timestamp.type=LogAppendTime?
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > I think this KIP is a good idea, but I think it relies on strict
> > > ordering
> > > > > of timestamps to be workable.
> > > > >
> > > > >
> > > > >
> > > > > Andrew Schofield
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > > Date: Fri, 12 Feb 2016 10:38:46 -0800
> > > > > > Subject: Re: [DISCUSS] KIP-47 - Add timestamp-based log deletion
> > > policy
> > > > > > From: n...@confluent.io
> > > > > > To: dev@kafka.apache.org
> > > > > >
> > > > > > Adding a timestamp based auto-expiration is useful and this
> > proposal
> > > > > makes
> > > > > > sense. Thx!
> > > > > >
> > > > > > On Wed, Feb 10, 2016 at 3:35 PM, Jay Kreps  wrote:
> > > > > >
> > > > > >> I think this makes a lot of sense and won't be hard to implement
> > and
> > > > > >> doesn't create too much in the way of new interfaces.
> > > > > >>
> > > > > >> -Jay
> > > > > >>
> > > > > >> On Tue, Feb 9, 2016 at 8:13 AM, Bill Warshaw  wrote:
> > > > > >>
> > > > > >>> Hello,
> > > > > >>>
> > > > > > >>> I just submitted KIP-47 for adding a new log deletion policy based on a
> > > > > > >>> minimum timestamp of messages to retain.
> > > > > > >>>
> > > > > > >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-47+-+Add+timestamp-based+log+deletion+policy
> > > > > >>>
> > > > > >>> I'm open to any comments or suggestions.
> > > > > >>>
> > > > > >>> Thanks,
> > > > > >>> Bill Warshaw
> > > > > >>>
> > > > > >>
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Thanks,
> > > > > > Neha
> > > > >
> > > >
> > >
> >
>


Re: [DISCUSS] KIP-47 - Add timestamp-based log deletion policy

2016-02-19 Thread Jun Rao
Hi, Bill,

I replied with the following comments earlier to the thread. Did you see
that?

Thanks for the proposal. A couple of comments.

1. It seems that this new policy should work for CreateTime as well. If a
topic is configured with CreateTime, messages may not be added in strict
order in the log. However, to build a time-based index, we will be
maintaining the largest timestamp for all messages in a log segment. We can
delete a segment if its largest timestamp is less than
log.retention.min.timestamp. This guarantees that no messages newer than
log.retention.min.timestamp will be deleted, which is probably what the
user wants.

2. Right now, the user can specify "delete" as the retention policy and a
log segment will be deleted either when the size of a partition exceeds a
threshold or the timestamp of a segment is older than a relative period of
time (say 7 days) from now. What you are proposing is not a new retention
policy, but an additional check that will cause a segment to be deleted
when the timestamp of a segment is older than an absolute timestamp? If so,
could you update the wiki accordingly?

Thanks,

Jun

On Fri, Feb 19, 2016 at 2:57 PM, Bill Warshaw <wdwars...@gmail.com> wrote:

> Hello all,
>
> What is the next step with this proposal?  The KIP-32 work that it is based
> on was merged earlier today (https://github.com/apache/kafka/pull/764,
> thank you Becket).  I have an implementation with tests, and I've confirmed
> that it actually works in a live system.  Is there more discussion that
> needs to be had about this KIP, or should I start a VOTE thread?
>
>
>
> On Tue, Feb 16, 2016 at 5:06 PM, Jun Rao <j...@confluent.io> wrote:
>
> > Bill,
> >
> > Thanks for the proposal. A couple of comments.
> >
> > 1. It seems that this new policy should work for CreateTime as well. If a
> > topic is configured with CreateTime, messages may not be added in strict
> > order in the log. However, to build a time-based index, we will be
> > maintaining the largest timestamp for all messages in a log segment. We
> can
> > delete a segment if its largest timestamp is less than
> > log.retention.min.timestamp. This guarantees that no messages newer than
> > log.retention.min.timestamp will be deleted, which is probably what the
> > user wants.
> >
> > 2. Right now, the user can specify "delete" as the retention policy and a
> > log segment will be deleted either when the size of a partition exceeds a
> > threshold or the timestamp of a segment is older than a relative period
> of
> > time (say 7 days) from now. What you are proposing is not a new retention
> > policy, but an additional check that will cause a segment to be deleted
> > when the timestamp of a segment is older than an absolute timestamp? If
> so,
> > could you update the wiki accordingly?
> >
> > Jun
> >
> >
> >
> > On Sat, Feb 13, 2016 at 3:23 PM, Bill Warshaw <wdwars...@gmail.com>
> wrote:
> >
> > > Hello,
> > >
> > > That is a good catch, thanks for pointing it out.  If this KIP is
> > accepted,
> > > we'd need to document this and make the log cleaner not run
> > timestamp-based
> > > deletion unless message.timestamp.type=LogAppendTime.
> > >
> > > On Sat, Feb 13, 2016 at 5:38 AM, Andrew Schofield <
> > > andrew_schofield_j...@outlook.com> wrote:
> > >
> > > > > This KIP is related to KIP-32, but it strikes me that it only makes
> > sense
> > > > with one of the two proposed message timestamp types. If I understand
> > > > correctly, message timestamps are only certain to be monotonically
> > > > increasing in the log if message.timestamp.type=LogAppendTime.
> > > >
> > > >
> > > >
> > > > Does timestamp-based auto-expiration require use of
> > > > message.timestamp.type=LogAppendTime?
> > > >
> > > >
> > > >
> > > >
> > > > I think this KIP is a good idea, but I think it relies on strict
> > ordering
> > > > of timestamps to be workable.
> > > >
> > > >
> > > >
> > > > Andrew Schofield
> > > >
> > > >
> > > >
> > > >
> > > > > Date: Fri, 12 Feb 2016 10:38:46 -0800
> > > > > Subject: Re: [DISCUSS] KIP-47 - Add timestamp-based log deletion
> > policy
> > > > > From: n...@confluent.io
> > > > > To: dev@kafka.apache.org
> > > > >
> > > > > Adding a timestamp based auto-expiration is useful and thi

Re: [DISCUSS] KIP-47 - Add timestamp-based log deletion policy

2016-02-19 Thread Ismael Juma
Hi Bill,

It would be good to add some information on the CreateTime versus
LogAppendTime issue. Aside from that, it seems like you could start the
VOTE thread. That sometimes triggers additional discussion and that is fine.

Ismael

On Fri, Feb 19, 2016 at 10:57 PM, Bill Warshaw <wdwars...@gmail.com> wrote:

> Hello all,
>
> What is the next step with this proposal?  The KIP-32 work that it is based
> on was merged earlier today (https://github.com/apache/kafka/pull/764,
> thank you Becket).  I have an implementation with tests, and I've confirmed
> that it actually works in a live system.  Is there more discussion that
> needs to be had about this KIP, or should I start a VOTE thread?
>
>
>
> On Tue, Feb 16, 2016 at 5:06 PM, Jun Rao <j...@confluent.io> wrote:
>
> > Bill,
> >
> > Thanks for the proposal. A couple of comments.
> >
> > 1. It seems that this new policy should work for CreateTime as well. If a
> > topic is configured with CreateTime, messages may not be added in strict
> > order in the log. However, to build a time-based index, we will be
> > maintaining the largest timestamp for all messages in a log segment. We
> can
> > delete a segment if its largest timestamp is less than
> > log.retention.min.timestamp. This guarantees that no messages newer than
> > log.retention.min.timestamp will be deleted, which is probably what the
> > user wants.
> >
> > 2. Right now, the user can specify "delete" as the retention policy and a
> > log segment will be deleted either when the size of a partition exceeds a
> > threshold or the timestamp of a segment is older than a relative period
> of
> > time (say 7 days) from now. What you are proposing is not a new retention
> > policy, but an additional check that will cause a segment to be deleted
> > when the timestamp of a segment is older than an absolute timestamp? If
> so,
> > could you update the wiki accordingly?
> >
> > Jun
> >
> >
> >
> > On Sat, Feb 13, 2016 at 3:23 PM, Bill Warshaw <wdwars...@gmail.com>
> wrote:
> >
> > > Hello,
> > >
> > > That is a good catch, thanks for pointing it out.  If this KIP is
> > accepted,
> > > we'd need to document this and make the log cleaner not run
> > timestamp-based
> > > deletion unless message.timestamp.type=LogAppendTime.
> > >
> > > On Sat, Feb 13, 2016 at 5:38 AM, Andrew Schofield <
> > > andrew_schofield_j...@outlook.com> wrote:
> > >
> > > > > This KIP is related to KIP-32, but it strikes me that it only makes
> > sense
> > > > with one of the two proposed message timestamp types. If I understand
> > > > correctly, message timestamps are only certain to be monotonically
> > > > increasing in the log if message.timestamp.type=LogAppendTime.
> > > >
> > > >
> > > >
> > > > Does timestamp-based auto-expiration require use of
> > > > message.timestamp.type=LogAppendTime?
> > > >
> > > >
> > > >
> > > >
> > > > I think this KIP is a good idea, but I think it relies on strict
> > ordering
> > > > of timestamps to be workable.
> > > >
> > > >
> > > >
> > > > Andrew Schofield
> > > >
> > > >
> > > >
> > > >
> > > > > Date: Fri, 12 Feb 2016 10:38:46 -0800
> > > > > Subject: Re: [DISCUSS] KIP-47 - Add timestamp-based log deletion
> > policy
> > > > > From: n...@confluent.io
> > > > > To: dev@kafka.apache.org
> > > > >
> > > > > Adding a timestamp based auto-expiration is useful and this
> proposal
> > > > makes
> > > > > sense. Thx!
> > > > >
> > > > > On Wed, Feb 10, 2016 at 3:35 PM, Jay Kreps  wrote:
> > > > >
> > > > >> I think this makes a lot of sense and won't be hard to implement
> and
> > > > >> doesn't create too much in the way of new interfaces.
> > > > >>
> > > > >> -Jay
> > > > >>
> > > > >> On Tue, Feb 9, 2016 at 8:13 AM, Bill Warshaw  wrote:
> > > > >>
> > > > >>> Hello,
> > > > >>>
> > > > > >>> I just submitted KIP-47 for adding a new log deletion policy based on a
> > > > > >>> minimum timestamp of messages to retain.
> > > > > >>>
> > > > > >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-47+-+Add+timestamp-based+log+deletion+policy
> > > > >>>
> > > > >>> I'm open to any comments or suggestions.
> > > > >>>
> > > > >>> Thanks,
> > > > >>> Bill Warshaw
> > > > >>>
> > > > >>
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Thanks,
> > > > > Neha
> > > >
> > >
> >
>


Re: [DISCUSS] KIP-47 - Add timestamp-based log deletion policy

2016-02-19 Thread Bill Warshaw
Hello all,

What is the next step with this proposal?  The KIP-32 work that it is based
on was merged earlier today (https://github.com/apache/kafka/pull/764,
thank you Becket).  I have an implementation with tests, and I've confirmed
that it actually works in a live system.  Is there more discussion that
needs to be had about this KIP, or should I start a VOTE thread?



On Tue, Feb 16, 2016 at 5:06 PM, Jun Rao <j...@confluent.io> wrote:

> Bill,
>
> Thanks for the proposal. A couple of comments.
>
> 1. It seems that this new policy should work for CreateTime as well. If a
> topic is configured with CreateTime, messages may not be added in strict
> order in the log. However, to build a time-based index, we will be
> maintaining the largest timestamp for all messages in a log segment. We can
> delete a segment if its largest timestamp is less than
> log.retention.min.timestamp. This guarantees that no messages newer than
> log.retention.min.timestamp will be deleted, which is probably what the
> user wants.
>
> 2. Right now, the user can specify "delete" as the retention policy and a
> log segment will be deleted either when the size of a partition exceeds a
> threshold or the timestamp of a segment is older than a relative period of
> time (say 7 days) from now. What you are proposing is not a new retention
> policy, but an additional check that will cause a segment to be deleted
> when the timestamp of a segment is older than an absolute timestamp? If so,
> could you update the wiki accordingly?
>
> Jun
>
>
>
> On Sat, Feb 13, 2016 at 3:23 PM, Bill Warshaw <wdwars...@gmail.com> wrote:
>
> > Hello,
> >
> > That is a good catch, thanks for pointing it out.  If this KIP is
> accepted,
> > we'd need to document this and make the log cleaner not run
> timestamp-based
> > deletion unless message.timestamp.type=LogAppendTime.
> >
> > On Sat, Feb 13, 2016 at 5:38 AM, Andrew Schofield <
> > andrew_schofield_j...@outlook.com> wrote:
> >
> > > > This KIP is related to KIP-32, but it strikes me that it only makes
> sense
> > > with one of the two proposed message timestamp types. If I understand
> > > correctly, message timestamps are only certain to be monotonically
> > > increasing in the log if message.timestamp.type=LogAppendTime.
> > >
> > >
> > >
> > > Does timestamp-based auto-expiration require use of
> > > message.timestamp.type=LogAppendTime?
> > >
> > >
> > >
> > >
> > > I think this KIP is a good idea, but I think it relies on strict
> ordering
> > > of timestamps to be workable.
> > >
> > >
> > >
> > > Andrew Schofield
> > >
> > >
> > >
> > >
> > > > Date: Fri, 12 Feb 2016 10:38:46 -0800
> > > > Subject: Re: [DISCUSS] KIP-47 - Add timestamp-based log deletion
> policy
> > > > From: n...@confluent.io
> > > > To: dev@kafka.apache.org
> > > >
> > > > Adding a timestamp based auto-expiration is useful and this proposal
> > > makes
> > > > sense. Thx!
> > > >
> > > > On Wed, Feb 10, 2016 at 3:35 PM, Jay Kreps  wrote:
> > > >
> > > >> I think this makes a lot of sense and won't be hard to implement and
> > > >> doesn't create too much in the way of new interfaces.
> > > >>
> > > >> -Jay
> > > >>
> > > >> On Tue, Feb 9, 2016 at 8:13 AM, Bill Warshaw  wrote:
> > > >>
> > > >>> Hello,
> > > >>>
> > > > >>> I just submitted KIP-47 for adding a new log deletion policy based on a
> > > > >>> minimum timestamp of messages to retain.
> > > > >>>
> > > > >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-47+-+Add+timestamp-based+log+deletion+policy
> > > >>>
> > > >>> I'm open to any comments or suggestions.
> > > >>>
> > > >>> Thanks,
> > > >>> Bill Warshaw
> > > >>>
> > > >>
> > > >
> > > >
> > > >
> > > > --
> > > > Thanks,
> > > > Neha
> > >
> >
>


Re: [DISCUSS] KIP-47 - Add timestamp-based log deletion policy

2016-02-16 Thread Jun Rao
Bill,

Thanks for the proposal. A couple of comments.

1. It seems that this new policy should work for CreateTime as well. If a
topic is configured with CreateTime, messages may not be added in strict
order in the log. However, to build a time-based index, we will be
maintaining the largest timestamp for all messages in a log segment. We can
delete a segment if its largest timestamp is less than
log.retention.min.timestamp. This guarantees that no messages newer than
log.retention.min.timestamp will be deleted, which is probably what the
user wants.

2. Right now, the user can specify "delete" as the retention policy and a
log segment will be deleted either when the size of a partition exceeds a
threshold or the timestamp of a segment is older than a relative period of
time (say 7 days) from now. What you are proposing is not a new retention
policy, but an additional check that will cause a segment to be deleted
when the timestamp of a segment is older than an absolute timestamp? If so,
could you update the wiki accordingly?
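
A minimal Java sketch of the check in point 1, assuming each segment can report the largest timestamp among its messages. The class, method, and parameter names are hypothetical and do not come from Kafka's code; only log.retention.min.timestamp is taken from the proposal, and it is represented here by the minTimestamp argument.

```java
// Hypothetical sketch; names are illustrative, not Kafka's actual classes.
final class MinTimestampRetentionSketch {

    // Largest timestamp in a segment. Under CreateTime the messages may be
    // out of order, so the maximum is not necessarily the last entry.
    static long largestTimestamp(long[] timestamps) {
        long max = Long.MIN_VALUE;
        for (long t : timestamps) {
            max = Math.max(max, t);
        }
        return max;
    }

    // A non-empty segment may be deleted only if every message in it is
    // older than minTimestamp (standing in for log.retention.min.timestamp).
    static boolean canDelete(long[] timestamps, long minTimestamp) {
        return timestamps.length > 0
                && largestTimestamp(timestamps) < minTimestamp;
    }
}
```

For a segment holding timestamps {100, 500, 150}, canDelete returns false for a minimum of 200 because the message stamped 500 is newer, and true only once the minimum exceeds 500. No message at or above the minimum is ever dropped, regardless of ordering.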

Jun



On Sat, Feb 13, 2016 at 3:23 PM, Bill Warshaw <wdwars...@gmail.com> wrote:

> Hello,
>
> That is a good catch, thanks for pointing it out.  If this KIP is accepted,
> we'd need to document this and make the log cleaner not run timestamp-based
> deletion unless message.timestamp.type=LogAppendTime.
>
> On Sat, Feb 13, 2016 at 5:38 AM, Andrew Schofield <
> andrew_schofield_j...@outlook.com> wrote:
>
> > This KIP is related to KIP-32, but it strikes me that it only makes sense
> > with one of the two proposed message timestamp types. If I understand
> > correctly, message timestamps are only certain to be monotonically
> > increasing in the log if message.timestamp.type=LogAppendTime.
> >
> >
> >
> > Does timestamp-based auto-expiration require use of
> > message.timestamp.type=LogAppendTime?
> >
> >
> >
> >
> > I think this KIP is a good idea, but I think it relies on strict ordering
> > of timestamps to be workable.
> >
> >
> >
> > Andrew Schofield
> >
> >
> >
> >
> > > Date: Fri, 12 Feb 2016 10:38:46 -0800
> > > Subject: Re: [DISCUSS] KIP-47 - Add timestamp-based log deletion policy
> > > From: n...@confluent.io
> > > To: dev@kafka.apache.org
> > >
> > > Adding a timestamp based auto-expiration is useful and this proposal
> > makes
> > > sense. Thx!
> > >
> > > On Wed, Feb 10, 2016 at 3:35 PM, Jay Kreps  wrote:
> > >
> > >> I think this makes a lot of sense and won't be hard to implement and
> > >> doesn't create too much in the way of new interfaces.
> > >>
> > >> -Jay
> > >>
> > >> On Tue, Feb 9, 2016 at 8:13 AM, Bill Warshaw  wrote:
> > >>
> > >>> Hello,
> > >>>
> > >>> I just submitted KIP-47 for adding a new log deletion policy based on a
> > >>> minimum timestamp of messages to retain.
> > >>>
> > >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-47+-+Add+timestamp-based+log+deletion+policy
> > >>>
> > >>> I'm open to any comments or suggestions.
> > >>>
> > >>> Thanks,
> > >>> Bill Warshaw
> > >>>
> > >>
> > >
> > >
> > >
> > > --
> > > Thanks,
> > > Neha
> >
>


RE: [DISCUSS] KIP-47 - Add timestamp-based log deletion policy

2016-02-13 Thread Andrew Schofield
This KIP is related to KIP-32, but it strikes me that it only makes sense with
one of the two proposed message timestamp types. If I understand correctly, 
message timestamps are only certain to be monotonically increasing in the log 
if message.timestamp.type=LogAppendTime.



Does timestamp-based auto-expiration require use of 
message.timestamp.type=LogAppendTime?




I think this KIP is a good idea, but I think it relies on strict ordering of 
timestamps to be workable.
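
To make the ordering concern concrete, here is a minimal Java sketch. It is
not Kafka code: the class and method names are hypothetical, segments are
modelled as plain arrays of message timestamps in offset order, and the rule
"a segment is deletable only when all of its timestamps are older than the
minimum" is an assumption made for illustration, not something the KIP text
is known to specify.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model: each segment is just its message timestamps, in offset order. */
final class TimestampOrderingSketch {

    /** Returns true if the timestamps never decrease in offset order. */
    static boolean isMonotonic(long[] timestamps) {
        for (int i = 1; i < timestamps.length; i++) {
            if (timestamps[i] < timestamps[i - 1]) {
                return false;
            }
        }
        return true;
    }

    /**
     * Counts the leading segments that could be dropped, assuming a segment is
     * deletable only when every timestamp in it is older than minTimestamp.
     * Counting stops at the first segment that must be kept, because the log
     * is only ever trimmed from the front.
     */
    static int deletablePrefix(List<long[]> segments, long minTimestamp) {
        int count = 0;
        for (long[] segment : segments) {
            long max = Long.MIN_VALUE;
            for (long ts : segment) {
                max = Math.max(max, ts);
            }
            if (max >= minTimestamp) {
                break;
            }
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        long minTimestamp = 155;

        // Broker-assigned timestamps: ordered across the whole log.
        List<long[]> appendTime = new ArrayList<>();
        appendTime.add(new long[] {100, 110, 120});
        appendTime.add(new long[] {130, 140, 150});
        appendTime.add(new long[] {160, 170, 180});

        // Producer-assigned timestamps: one producer's clock runs ahead (170).
        List<long[]> createTime = new ArrayList<>();
        createTime.add(new long[] {100, 110, 170});
        createTime.add(new long[] {130, 140, 150});
        createTime.add(new long[] {160, 165, 180});

        System.out.println(deletablePrefix(appendTime, minTimestamp)); // 2
        System.out.println(deletablePrefix(createTime, minTimestamp)); // 0
    }
}
```

In the second case the middle segment is entirely older than the minimum
timestamp, yet it cannot be removed because the segment in front of it still
holds a newer message. That is the situation I would expect to be possible
whenever timestamps are set by producers rather than by the broker.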



Andrew Schofield




> Date: Fri, 12 Feb 2016 10:38:46 -0800
> Subject: Re: [DISCUSS] KIP-47 - Add timestamp-based log deletion policy
> From: n...@confluent.io
> To: dev@kafka.apache.org
> 
> Adding a timestamp based auto-expiration is useful and this proposal makes
> sense. Thx!
> 
> On Wed, Feb 10, 2016 at 3:35 PM, Jay Kreps  wrote:
> 
>> I think this makes a lot of sense and won't be hard to implement and
>> doesn't create too much in the way of new interfaces.
>>
>> -Jay
>>
>> On Tue, Feb 9, 2016 at 8:13 AM, Bill Warshaw  wrote:
>>
>>> Hello,
>>>
>>> I just submitted KIP-47 for adding a new log deletion policy based on a
>>> minimum timestamp of messages to retain.
>>>
>>>
>>>
>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-47+-+Add+timestamp-based+log+deletion+policy
>>>
>>> I'm open to any comments or suggestions.
>>>
>>> Thanks,
>>> Bill Warshaw
>>>
>>
> 
> 
> 
> -- 
> Thanks,
> Neha

Re: [DISCUSS] KIP-47 - Add timestamp-based log deletion policy

2016-02-13 Thread Enrico Olivelli
Hi,
I'm currently using Kafka 0.9 as a commit log. I would find it more useful to
set an offset for every partition (actually I am using only one partition
per topic) instead of a global timestamp. Has this option already been
considered? Thanks
-- Enrico

On Sat, 13 Feb 2016 at 11:38, Andrew Schofield <
andrew_schofield_j...@outlook.com> wrote:

> This KIP is related to KIP-32, but it strikes me that it only makes sense
> with one of the two proposed message timestamp types. If I understand
> correctly, message timestamps are only certain to be monotonically
> increasing in the log if message.timestamp.type=LogAppendTime.
>
>
>
> Does timestamp-based auto-expiration require use of
> message.timestamp.type=LogAppendTime?
>
>
>
>
> I think this KIP is a good idea, but I think it relies on strict ordering
> of timestamps to be workable.
>
>
>
> Andrew Schofield
>
>
>
>
> > Date: Fri, 12 Feb 2016 10:38:46 -0800
> > Subject: Re: [DISCUSS] KIP-47 - Add timestamp-based log deletion policy
> > From: n...@confluent.io
> > To: dev@kafka.apache.org
> >
> > Adding a timestamp based auto-expiration is useful and this proposal makes
> > sense. Thx!
> >
> > On Wed, Feb 10, 2016 at 3:35 PM, Jay Kreps  wrote:
> >
> >> I think this makes a lot of sense and won't be hard to implement and
> >> doesn't create too much in the way of new interfaces.
> >>
> >> -Jay
> >>
> >> On Tue, Feb 9, 2016 at 8:13 AM, Bill Warshaw  wrote:
> >>
> >>> Hello,
> >>>
> >>> I just submitted KIP-47 for adding a new log deletion policy based on a
> >>> minimum timestamp of messages to retain.
> >>>
> >>>
> >>>
> >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-47+-+Add+timestamp-based+log+deletion+policy
> >>>
> >>> I'm open to any comments or suggestions.
> >>>
> >>> Thanks,
> >>> Bill Warshaw
> >>>
> >>
> >
> >
> >
> > --
> > Thanks,
> > Neha

-- Enrico Olivelli


Re: [DISCUSS] KIP-47 - Add timestamp-based log deletion policy

2016-02-13 Thread Bill Warshaw
Hello,

That is a good catch, thanks for pointing it out.  If this KIP is accepted,
we'd need to document this and make the log cleaner not run timestamp-based
deletion unless message.timestamp.type=LogAppendTime.
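
As a rough illustration of that guard, here is a minimal Java sketch. It is
not the actual implementation: the class, enum, and method names are
hypothetical, and treating a non-positive minimum timestamp as "not
configured" is an assumption made only for this example. The two accepted
strings are the values of message.timestamp.type from KIP-32.

```java
/** Hypothetical guard deciding whether timestamp-based deletion may run for a topic. */
final class TimestampDeletionGuard {

    /** The two timestamp types introduced by KIP-32. */
    enum TimestampType { CREATE_TIME, LOG_APPEND_TIME }

    /** Parses a message.timestamp.type value; rejects anything unknown. */
    static TimestampType parse(String value) {
        switch (value) {
            case "CreateTime":
                return TimestampType.CREATE_TIME;
            case "LogAppendTime":
                return TimestampType.LOG_APPEND_TIME;
            default:
                throw new IllegalArgumentException("Unknown timestamp type: " + value);
        }
    }

    /**
     * Timestamp-based deletion runs only when timestamps are assigned by the
     * broker and a minimum timestamp has been set.
     * Assumption: minTimestamp <= 0 means the setting is not configured.
     */
    static boolean shouldRunTimestampDeletion(TimestampType type, long minTimestamp) {
        return type == TimestampType.LOG_APPEND_TIME && minTimestamp > 0;
    }

    public static void main(String[] args) {
        long minTimestamp = 1455000000000L; // epoch milliseconds, arbitrary example value
        System.out.println(shouldRunTimestampDeletion(parse("LogAppendTime"), minTimestamp));
        System.out.println(shouldRunTimestampDeletion(parse("CreateTime"), minTimestamp));
    }
}
```

With a check like this, a topic using CreateTime would simply skip the new
deletion path and keep the existing size-based and age-based retention.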

On Sat, Feb 13, 2016 at 5:38 AM, Andrew Schofield <
andrew_schofield_j...@outlook.com> wrote:

> This KIP is related to KIP-32, but it strikes me that it only makes sense
> with one of the two proposed message timestamp types. If I understand
> correctly, message timestamps are only certain to be monotonically
> increasing in the log if message.timestamp.type=LogAppendTime.
>
>
>
> Does timestamp-based auto-expiration require use of
> message.timestamp.type=LogAppendTime?
>
>
>
>
> I think this KIP is a good idea, but I think it relies on strict ordering
> of timestamps to be workable.
>
>
>
> Andrew Schofield
>
>
>
>
> > Date: Fri, 12 Feb 2016 10:38:46 -0800
> > Subject: Re: [DISCUSS] KIP-47 - Add timestamp-based log deletion policy
> > From: n...@confluent.io
> > To: dev@kafka.apache.org
> >
> > Adding a timestamp based auto-expiration is useful and this proposal makes
> > sense. Thx!
> >
> > On Wed, Feb 10, 2016 at 3:35 PM, Jay Kreps  wrote:
> >
> >> I think this makes a lot of sense and won't be hard to implement and
> >> doesn't create too much in the way of new interfaces.
> >>
> >> -Jay
> >>
> >> On Tue, Feb 9, 2016 at 8:13 AM, Bill Warshaw  wrote:
> >>
> >>> Hello,
> >>>
> >>> I just submitted KIP-47 for adding a new log deletion policy based on a
> >>> minimum timestamp of messages to retain.
> >>>
> >>>
> >>>
> >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-47+-+Add+timestamp-based+log+deletion+policy
> >>>
> >>> I'm open to any comments or suggestions.
> >>>
> >>> Thanks,
> >>> Bill Warshaw
> >>>
> >>
> >
> >
> >
> > --
> > Thanks,
> > Neha
>


Re: [DISCUSS] KIP-47 - Add timestamp-based log deletion policy

2016-02-13 Thread Bill Warshaw
We initially looked at exposing an API through AdminUtils that would
manually delete everything in a given partition before a specified offset.
We ran into several difficulties in figuring out how to efficiently
communicate this "truncate partition" command to all brokers through
ZooKeeper without adding a large number of ZooKeeper watches or storing
large amounts of data in ZooKeeper.

To clarify, the minimum timestamp will be a per-topic configuration.

On Sat, Feb 13, 2016 at 5:02 PM, Enrico Olivelli <eolive...@gmail.com>
wrote:

> Hi,
> I'm currently using Kafka 0.9 as a commit log. I would find it more useful to
> set an offset for every partition (actually I am using only one partition
> per topic) instead of a global timestamp. Has this option already been
> considered? Thanks
> -- Enrico
>
> On Sat, 13 Feb 2016 at 11:38, Andrew Schofield <
> andrew_schofield_j...@outlook.com> wrote:
>
> > This KIP is related to KIP-32, but it strikes me that it only makes sense
> > with one of the two proposed message timestamp types. If I understand
> > correctly, message timestamps are only certain to be monotonically
> > increasing in the log if message.timestamp.type=LogAppendTime.
> >
> >
> >
> > Does timestamp-based auto-expiration require use of
> > message.timestamp.type=LogAppendTime?
> >
> >
> >
> >
> > I think this KIP is a good idea, but I think it relies on strict ordering
> > of timestamps to be workable.
> >
> >
> >
> > Andrew Schofield
> >
> >
> >
> >
> > > Date: Fri, 12 Feb 2016 10:38:46 -0800
> > > Subject: Re: [DISCUSS] KIP-47 - Add timestamp-based log deletion policy
> > > From: n...@confluent.io
> > > To: dev@kafka.apache.org
> > >
> > > Adding a timestamp based auto-expiration is useful and this proposal makes
> > > sense. Thx!
> > >
> > > On Wed, Feb 10, 2016 at 3:35 PM, Jay Kreps  wrote:
> > >
> > >> I think this makes a lot of sense and won't be hard to implement and
> > >> doesn't create too much in the way of new interfaces.
> > >>
> > >> -Jay
> > >>
> > >> On Tue, Feb 9, 2016 at 8:13 AM, Bill Warshaw  wrote:
> > >>
> > >>> Hello,
> > >>>
> > >>> I just submitted KIP-47 for adding a new log deletion policy based on a
> > >>> minimum timestamp of messages to retain.
> > >>>
> > >>>
> > >>>
> > >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-47+-+Add+timestamp-based+log+deletion+policy
> > >>>
> > >>> I'm open to any comments or suggestions.
> > >>>
> > >>> Thanks,
> > >>> Bill Warshaw
> > >>>
> > >>
> > >
> > >
> > >
> > > --
> > > Thanks,
> > > Neha
>
> -- Enrico Olivelli
>


Re: [DISCUSS] KIP-47 - Add timestamp-based log deletion policy

2016-02-13 Thread Ismael Juma
Hi Bill,

Thanks for the proposal, it makes sense. A minor comment:
`log.retention.mintimestamp` should maybe be called
`log.retention.min.timestamp` to be consistent with other properties. Also,
it would be good to update the KIP to mention the LogAppendTime point.

Ismael

On Sat, Feb 13, 2016 at 11:23 PM, Bill Warshaw <wdwars...@gmail.com> wrote:

> Hello,
>
> That is a good catch, thanks for pointing it out.  If this KIP is accepted,
> we'd need to document this and make the log cleaner not run timestamp-based
> deletion unless message.timestamp.type=LogAppendTime.
>
> On Sat, Feb 13, 2016 at 5:38 AM, Andrew Schofield <
> andrew_schofield_j...@outlook.com> wrote:
>
> > This KIP is related to KIP-32, but it strikes me that it only makes sense
> > with one of the two proposed message timestamp types. If I understand
> > correctly, message timestamps are only certain to be monotonically
> > increasing in the log if message.timestamp.type=LogAppendTime.
> >
> >
> >
> > Does timestamp-based auto-expiration require use of
> > message.timestamp.type=LogAppendTime?
> >
> >
> >
> >
> > I think this KIP is a good idea, but I think it relies on strict ordering
> > of timestamps to be workable.
> >
> >
> >
> > Andrew Schofield
> >
> >
> >
> >
> > > Date: Fri, 12 Feb 2016 10:38:46 -0800
> > > Subject: Re: [DISCUSS] KIP-47 - Add timestamp-based log deletion policy
> > > From: n...@confluent.io
> > > To: dev@kafka.apache.org
> > >
> > > Adding a timestamp based auto-expiration is useful and this proposal makes
> > > sense. Thx!
> > >
> > > On Wed, Feb 10, 2016 at 3:35 PM, Jay Kreps  wrote:
> > >
> > >> I think this makes a lot of sense and won't be hard to implement and
> > >> doesn't create too much in the way of new interfaces.
> > >>
> > >> -Jay
> > >>
> > >> On Tue, Feb 9, 2016 at 8:13 AM, Bill Warshaw  wrote:
> > >>
> > >>> Hello,
> > >>>
> > >>> I just submitted KIP-47 for adding a new log deletion policy based on a
> > >>> minimum timestamp of messages to retain.
> > >>>
> > >>>
> > >>>
> > >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-47+-+Add+timestamp-based+log+deletion+policy
> > >>>
> > >>> I'm open to any comments or suggestions.
> > >>>
> > >>> Thanks,
> > >>> Bill Warshaw
> > >>>
> > >>
> > >
> > >
> > >
> > > --
> > > Thanks,
> > > Neha
> >
>


Re: [DISCUSS] KIP-47 - Add timestamp-based log deletion policy

2016-02-12 Thread Neha Narkhede
Adding a timestamp based auto-expiration is useful and this proposal makes
sense. Thx!

On Wed, Feb 10, 2016 at 3:35 PM, Jay Kreps  wrote:

> I think this makes a lot of sense and won't be hard to implement and
> doesn't create too much in the way of new interfaces.
>
> -Jay
>
> On Tue, Feb 9, 2016 at 8:13 AM, Bill Warshaw  wrote:
>
> > Hello,
> >
> > I just submitted KIP-47 for adding a new log deletion policy based on a
> > minimum timestamp of messages to retain.
> >
> >
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-47+-+Add+timestamp-based+log+deletion+policy
> >
> > I'm open to any comments or suggestions.
> >
> > Thanks,
> > Bill Warshaw
> >
>



-- 
Thanks,
Neha


Re: [DISCUSS] KIP-47 - Add timestamp-based log deletion policy

2016-02-10 Thread Jay Kreps
I think this makes a lot of sense and won't be hard to implement and
doesn't create too much in the way of new interfaces.

-Jay

On Tue, Feb 9, 2016 at 8:13 AM, Bill Warshaw  wrote:

> Hello,
>
> I just submitted KIP-47 for adding a new log deletion policy based on a
> minimum timestamp of messages to retain.
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-47+-+Add+timestamp-based+log+deletion+policy
>
> I'm open to any comments or suggestions.
>
> Thanks,
> Bill Warshaw
>


[DISCUSS] KIP-47 - Add timestamp-based log deletion policy

2016-02-09 Thread Bill Warshaw
Hello,

I just submitted KIP-47 for adding a new log deletion policy based on a
minimum timestamp of messages to retain.

https://cwiki.apache.org/confluence/display/KAFKA/KIP-47+-+Add+timestamp-based+log+deletion+policy

I'm open to any comments or suggestions.

Thanks,
Bill Warshaw