log.retention.bytes can help somewhat, but it is cumbersome to use because it is a per-topic config that caps each partition's log, not total disk usage.
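To illustrate why the per-partition semantics make it awkward as a disk-full guard, here is a rough sketch, assuming a 0.8.x-era deployment; the topic name and sizes are made up:

    # broker default in server.properties; the cap applies to EACH
    # partition's log, not the disk as a whole:
    log.retention.bytes=10737418240

    # topic-level override (hypothetical topic name):
    # bin/kafka-topics.sh --zookeeper localhost:2181 --alter \
    #     --topic events --config retention.bytes=10737418240

With, say, 50 partitions hosted on a broker, the effective cap is 50 x 10 GB = 500 GB, and the numbers have to be re-derived whenever partition counts or disk sizes change. A single global bytes limit would avoid that bookkeeping.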
there was an earlier thread regarding a global bytes limit. That would work well for my purpose of avoiding disk full:
https://issues.apache.org/jira/browse/KAFKA-1489

On Thu, Jul 31, 2014 at 7:39 PM, Joe Stein <joe.st...@stealth.ly> wrote:

> What version of Kafka are you using? Have you tried log.retention.bytes?
> Whichever comes first (TTL or total bytes) should do what you are looking
> for, if I understand you right.
> http://kafka.apache.org/documentation.html#brokerconfigs
>
> /*******************************************
>  Joe Stein
>  Founder, Principal Consultant
>  Big Data Open Source Security LLC
>  http://www.stealth.ly
>  Twitter: @allthingshadoop
> ********************************************/
>
> On Jul 31, 2014 6:52 PM, "Steven Wu" <steve...@netflix.com.invalid> wrote:
>
> > It seems that log retention is based purely on the last-touch/modified
> > timestamp of the log files. This is undesirable for code pushes in
> > AWS/cloud.
> >
> > E.g., say the retention window is 24 hours, disk size is 1 TB, and disk
> > utilization is 60% (600 GB). When a new instance comes up, it fetches
> > the log files (600 GB) from its peers. Those files all get fresh
> > timestamps, so they won't be purged until 24 hours later. Note that
> > during those first 24 hours, new messages (another 600 GB) continue to
> > come in. This can cause a disk-full problem without any intervention.
> > With this behavior, we have to keep disk utilization under 50%.
> >
> > Could the last-modified timestamp be inserted into the file name when
> > rolling over log files? Then Kafka could check the file name for the
> > timestamp. Does this make sense?
> >
> > Thanks,
> > Steven
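For what it's worth, a minimal Java sketch of the file-name idea follows. The segment name format ("<baseOffset>.<rollTimestampMs>.log"), the class, and the method names are all assumptions for illustration, not Kafka's actual on-disk layout or code; the point is only that retention would key off the embedded roll timestamp instead of the mtime, which gets reset when a replica re-fetches segments:

    import java.io.File;

    public class TimestampedSegmentRetention {

        // Parse the roll timestamp (ms since epoch) out of the assumed
        // "<baseOffset>.<rollTimestampMs>.log" name; fall back to
        // lastModified() for files without an embedded timestamp.
        static long segmentTimestampMs(File segment) {
            String[] parts = segment.getName().split("\\.");
            if (parts.length == 3 && parts[2].equals("log")) {
                try {
                    return Long.parseLong(parts[1]);
                } catch (NumberFormatException e) {
                    // malformed name; fall through to mtime
                }
            }
            return segment.lastModified();
        }

        // A segment becomes eligible for deletion once its embedded roll
        // timestamp (not its mtime) falls outside the retention window,
        // so re-fetched replicas age out on the original schedule.
        static boolean eligibleForDeletion(File segment, long retentionMs) {
            return System.currentTimeMillis() - segmentTimestampMs(segment) > retentionMs;
        }
    }

In the code-push scenario above, segments copied to the new instance would keep their original roll timestamps in the name, so the 600 GB fetched from peers would still purge on the original 24-hour schedule instead of starting a fresh window.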