That's great, thank you - I have it working. One other thing I noticed: if I send a batch of data and then wait, compaction never happens. If I send a few more messages later, the first batch does get compacted. I guess it needs a continuing flow of messages to roll the active segment and trigger compaction of the completed segments. So it shows that my test doesn't match real life. 😃
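For anyone else testing this, a minimal sketch of the nudge I use, assuming a topic called my-topic, a local 0.10.x broker on localhost:9092, and colon-separated keys (the names are placeholders for my real setup):

    # after segment.ms (60s here) has elapsed, send a couple of keyed messages;
    # appending them rolls the active segment, which the cleaner can then compact
    bin/kafka-console-producer.sh --broker-list localhost:9092 --topic my-topic \
        --property parse.key=true --property key.separator=:
    nudge-1:ignore-me
    nudge-2:ignore-me

The cleaner never touches the active segment, so without that extra traffic the last batch just sits there uncompacted.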
On Fri, 6 Jan 2017 at 21:36, Ewen Cheslack-Postava <[email protected]> wrote:

> On Fri, Jan 6, 2017 at 3:57 AM, Mike Gould <[email protected]> wrote:
>
> > Hi
> >
> > I'm trying to configure log compaction + deletion as per KIP-71 in
> > kafka 0.10.1 but so far haven't had any luck. My tests show more than
> > 50% duplicate keys when reading from the beginning even several
> > minutes after all the events were sent.
> > The documentation in section 3.1 doesn't seem very clear to me in
> > terms of exactly how to configure particular behavior. Could someone
> > please clarify a few things for me?
> >
> > In order to significantly reduce the amount of data that new
> > subscribers have to receive I want to compact events as soon as
> > possible, and delete any events more than 24 hours old (e.g. if there
> > hasn't been an update with a matching key for 24h).
> >
> > I have set
> >
> > cleanup.policy=compact, delete
> > min.cleanable.dirty.ratio=0.5
> > min.compaction.lag.ms=0
> > retention.ms=86400000
> > delete.retention.ms=86460000
> > segment.ms=60000
> >
> > - Should the cleanup.policy be "compact,delete" or "compact, delete"
> > or something else?
>
> Either should work; extra leading and trailing spaces are removed.
>
> > - Are events eligible for compaction soon after the
> > min.compaction.lag.ms time and segment.ms, or is there another
> > parameter that affects this? I.e. if I read from the beginning after
> > a couple of minutes, should I see no more than 50% of the received
> > events with the same key as previous events?
>
> Maybe you need to modify log.retention.check.interval.ms? It defaults
> to 5 minutes. The log cleaning runs periodically, so you may just not
> have waited long enough for cleaning to have executed.
>
> > - Does the retention.ms parameter only affect the deletion?
> > - How can I tell if the config is accepted and compaction is working?
> > Is there something useful to search for in the logs?
>
> Check for logs from LogCleaner.scala. It should log some info when it
> runs.
>
> > - Also if I change the topic config via the kafka-configs.sh tool
> > does the change take effect immediately for existing events, do I
> > have to restart the brokers, or does it only affect new events?
>
> Topic config changes shouldn't need a broker restart.
>
> -Ewen
>
> > Thank you
> > Mike G
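
A rough sketch of how I check the config and the cleaner activity, assuming ZooKeeper on localhost:2181, a topic called my-topic, and the default log4j layout that ships with the broker (names and paths are placeholders):

    # confirm the topic-level overrides were accepted
    bin/kafka-configs.sh --zookeeper localhost:2181 --describe \
        --entity-type topics --entity-name my-topic

    # recent kafka-configs.sh versions accept square brackets around
    # comma-separated values; older releases may need a different quoting approach
    bin/kafka-configs.sh --zookeeper localhost:2181 --alter \
        --entity-type topics --entity-name my-topic \
        --add-config cleanup.policy=[compact,delete],min.cleanable.dirty.ratio=0.5

    # the default distribution routes LogCleaner output to logs/log-cleaner.log;
    # adjust the path for your log4j setup
    grep -i "cleaner" logs/log-cleaner.log | tail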
