Inder,

>> 2. Why would you want to have multiple files within a partition. Broker has
>> to store more info to figure the right file among a partition.

There is not much advantage apart from better accuracy with the
getLatestOffset API. If you use that API to start consuming data close to a
certain timestamp, you get better accuracy with smaller log files.

>> 3. Is it to achieve mmap kinda optimization and allowing the broker to do
>> less I/O in case a feed is really huge or any thing else.

Not really. mmap is useful when you have random access on large files, or
have multiple processes trying to access the same file. It might actually not
work well with large files if your memory is fragmented. Since we have
sequential IO patterns, the filesystem caching itself works very well.

Thanks,
Neha

On Tuesday, October 25, 2011, Jay Kreps <jay.kr...@gmail.com> wrote:
> It is actually just to allow data deletion, we just delete whole segments in
> the cleanup. There is not much value to tuning the file size for most
> situations, but the tradeoff is that with smaller files you will have more
> open files but be closer to your desired retention.hours and retention.size
> settings.
>
> -Jay
>
> On Tue, Oct 25, 2011 at 1:59 AM, Inder Pall <inder.p...@gmail.com> wrote:
>
>> i am playing around with "log.file.size"(controls the size of a segment
>> file in a partition) and "log.retention.hours" with the following config.
>> log.file.size=500
>> log.retention.hours=168
>>
>> Observation - i see multiple files getting generated within the same
>> partition.
>> Example : my topic name is revenue feed and i see the following
>>
>> ls -lh /tmp/kafka-logs/revenuefeed-0/*
>> -rw-r--r-- 1 inder users 537 Oct 25 01:38 /tmp/kafka-logs/revenuefeed-0/00000000000000000000.kafka
>> -rw-r--r-- 1 inder users 512 Oct 25 01:39 /tmp/kafka-logs/revenuefeed-0/00000000000000000537.kafka
>>
>> Questions
>> --------------
>> 1. Shouldn't these two properties go hand in hand
>> 2. Why would you want to have multiple files within a partition. Broker has
>> to store more info to figure the right file among a partition.
>> 3. Is it to achieve mmap kinda optimization and allowing the broker to do
>> less I/O in case a feed is really huge or any thing else.
>>
>> -- Inder
>>
>
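
For illustration only, here is a rough Python model of the two points discussed in this thread: locating the right file for an offset, and deleting whole segments during cleanup. It is not the broker's actual code. In the listing above, the second file name (537) matches the size of the first file, which suggests each segment is named after its starting offset; the sketch assumes that. The names Segment, parse_base_offset, find_segment and expired_segments are hypothetical, and measuring a segment's age by its last-modified time is also an assumption.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Segment:
    base_offset: int      # starting offset, parsed from the file name (assumed)
    last_modified: float  # seconds since the epoch (assumed age measure)


def parse_base_offset(file_name: str) -> int:
    """Turn a name like '00000000000000000537.kafka' into the integer 537."""
    return int(file_name.split(".")[0])


def find_segment(segments, offset):
    """Return the segment that should contain `offset`.

    `segments` must be sorted by base_offset. Only the sorted list of
    starting offsets is needed, so the lookup is a binary search.
    """
    bases = [s.base_offset for s in segments]
    i = bisect_right(bases, offset) - 1
    if i < 0:
        raise ValueError(f"offset {offset} is before the first segment")
    return segments[i]


def expired_segments(segments, now, retention_hours):
    """Return the segments older than the retention window.

    Deletion works on whole files, so data can outlive the configured
    retention by up to the time span covered by one segment.
    """
    cutoff = now - retention_hours * 3600
    return [s for s in segments if s.last_modified < cutoff]
```

With the two files from the listing, find_segment maps offset 536 to the first file and offset 537 to the second. Smaller segments make the list longer, but each file covers a shorter time span, which is why cleanup lands closer to the configured retention.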