> On Aug. 10, 2014, 11:44 p.m., Jun Rao wrote:
> > core/src/main/scala/kafka/log/LogCleaner.scala, lines 400-420
> > <https://reviews.apache.org/r/24214/diff/4/?file=657033#file657033line400>
> >
> >     Thinking about this a bit more. I am wondering if it would be better if
> >     we introduce a per-topic level log.compact.compress.codec property. During
> >     log compaction, we always write the retained data using the specified
> >     compression codec, independent of whether the original records are
> >     compressed or not. This provides the following benefits.
> >
> >     1. Whether or not the messages were compressed originally, they can be
> >     compressed on the broker side over time. Since compacted topics preserve
> >     records much longer, enabling compression on the broker side will be
> >     beneficial in general.
> >
> >     2. As old records are removed, we still want to batch enough messages
> >     to do the compression.
> >
> >     3. The code can be a bit simpler. We can just (deep) iterate messages
> >     (using MemoryRecords.iterator) and append retained messages to an output
> >     MemoryRecords. The output MemoryRecords will be initialized with the
> >     configured compression codec and batch size.
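
For concreteness, the compaction pass described in point 3 above might look roughly like the sketch below. It is only an illustration under stated assumptions: it does not use Kafka's MemoryRecords or LogCleaner, the names CompactionSketch, Rec, retainLatest and writeBatches are hypothetical, and gzip from java.util.zip stands in for whatever codec the proposed property would select.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.zip.GZIPOutputStream;

public class CompactionSketch {

    /** Hypothetical stand-in for a log message (not a Kafka class). */
    static final class Rec {
        final long offset;
        final String key;
        final String value;

        Rec(long offset, String key, String value) {
            this.offset = offset;
            this.key = key;
            this.value = value;
        }
    }

    /** Keep only the highest-offset record for each key, preserving offset order. */
    static List<Rec> retainLatest(List<Rec> log) {
        Map<String, Long> latest = new HashMap<>();
        for (Rec r : log) {
            latest.merge(r.key, r.offset, Math::max);
        }
        List<Rec> retained = new ArrayList<>();
        for (Rec r : log) {
            if (latest.get(r.key) == r.offset) {
                retained.add(r);
            }
        }
        return retained;
    }

    /**
     * Append retained records to output batches of at most batchSize records.
     * Every batch is gzip-compressed, whether or not the input was compressed.
     */
    static List<byte[]> writeBatches(List<Rec> retained, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<byte[]> batches = new ArrayList<>();
        for (int start = 0; start < retained.size(); start += batchSize) {
            int end = Math.min(start + batchSize, retained.size());
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
                for (Rec r : retained.subList(start, end)) {
                    // Toy wire format for illustration: offset, key, value per line.
                    String line = r.offset + "\t" + r.key + "\t" + r.value + "\n";
                    gz.write(line.getBytes(StandardCharsets.UTF_8));
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            batches.add(buf.toByteArray());
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Rec> log = new ArrayList<>();
        log.add(new Rec(0, "a", "1"));
        log.add(new Rec(1, "b", "2"));
        log.add(new Rec(2, "a", "3")); // supersedes offset 0 for key "a"
        log.add(new Rec(3, "c", "4"));

        List<Rec> retained = retainLatest(log);
        List<byte[]> batches = writeBatches(retained, 2);
        System.out.println(retained.size() + " records retained in "
                + batches.size() + " batches");
    }
}
```

Because the retained records are regrouped before compression, the output batches stay reasonably full even after many old records have been removed, which is the concern in point 2.
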

What you proposed is similar to KAFKA-1499, which deals with the default broker-side compression configuration. I proposed new configuration properties on KAFKA-1499. The idea is to compress the data when it reaches the server, which applies to all topics (both log compaction and retention). Can you comment on KAFKA-1499?


- Manikumar Reddy


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24214/#review50128
-----------------------------------------------------------


On Aug. 9, 2014, 10:51 a.m., Manikumar Reddy O wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24214/
> -----------------------------------------------------------
>
> (Updated Aug. 9, 2014, 10:51 a.m.)
>
>
> Review request for kafka.
>
>
> Bugs: KAFKA-1374
>     https://issues.apache.org/jira/browse/KAFKA-1374
>
>
> Repository: kafka
>
>
> Description
> -------
>
> Addressed Jun's comments; added a few changes in LogCleaner stats for
> compressed messages.
>
>
> Diffs
> -----
>
>   core/src/main/scala/kafka/log/LogCleaner.scala c20de4ad4734c0bd83c5954fdb29464a27b91dff
>   core/src/test/scala/unit/kafka/log/LogCleanerIntegrationTest.scala 5bfa764638e92f217d0ff7108ec8f53193c22978
>
> Diff: https://reviews.apache.org/r/24214/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Manikumar Reddy O
>