It is actually just to allow data deletion; the cleanup deletes whole segments at a time. There is not much value in tuning the file size for most situations. The tradeoff is that with smaller files you will have more open files, but since a segment can only be dropped as a whole, what stays on disk will be closer to your desired retention.hours and retention.size settings.
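
To make the tradeoff concrete, here is a rough sketch in Python. It is not the broker's code. It assumes steady traffic, so that "hours per segment" stands in for the file size, and it assumes a segment becomes eligible for deletion once its newest write is older than the retention window. All names in it are made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    first_write_hour: float  # when the oldest message in the file was written
    last_write_hour: float   # when the newest message in the file was written


def build_segments(total_hours: int, hours_per_segment: int) -> List[Segment]:
    """Roll a new segment every `hours_per_segment` hours of steady traffic."""
    return [
        Segment(start, min(start + hours_per_segment, total_hours))
        for start in range(0, total_hours, hours_per_segment)
    ]


def cleanup(segments: List[Segment], now_hour: float,
            retention_hours: float) -> List[Segment]:
    """Delete whole segments only: a file survives while any of it is young enough."""
    return [s for s in segments if now_hour - s.last_write_hour <= retention_hours]


def oldest_retained_age(segments: List[Segment], now_hour: float) -> float:
    """Age in hours of the oldest data still on disk."""
    return now_hour - min(s.first_write_hour for s in segments)


# Hypothetical numbers: 1000 hours of traffic, 168 hours of retention.
small = cleanup(build_segments(1000, 1), now_hour=1000, retention_hours=168)
large = cleanup(build_segments(1000, 100), now_hour=1000, retention_hours=168)

print(len(small), oldest_retained_age(small, 1000))  # 169 files, oldest data 169h old
print(len(large), oldest_retained_age(large, 1000))  # 2 files, oldest data 200h old
```

With small segments the oldest data on disk is about an hour past the 168-hour target, at the cost of 169 files. With large segments there are only 2 files, but data up to 200 hours old is still around.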

-Jay

On Tue, Oct 25, 2011 at 1:59 AM, Inder Pall <inder.p...@gmail.com> wrote:
> i am playing around with "log.file.size" (controls the size of a segment file
> in a partition) and "log.retention.hours" with the following config.
> log.file.size=500
> log.retention.hours=168
>
> Observation - i see multiple files getting generated within the same
> partition.
> Example : my topic name is revenue feed and i see the following
>
> ls -lh /tmp/kafka-logs/revenuefeed-0/*
> -rw-r--r-- 1 inder users 537 Oct 25 01:38 /tmp/kafka-logs/revenuefeed-0/00000000000000000000.kafka
> -rw-r--r-- 1 inder users 512 Oct 25 01:39 /tmp/kafka-logs/revenuefeed-0/00000000000000000537.kafka
>
> Questions
> --------------
> 1. Shouldn't these two properties go hand in hand
> 2. Why would you want to have multiple files within a partition. Broker has
> to store more info to figure the right file among a partition.
> 3. Is it to achieve mmap kinda optimization and allowing the broker to do
> less I/O in case a feed is really huge or any thing else.
>
> -- Inder