[
https://issues.apache.org/jira/browse/KAFKA-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14558017#comment-14558017
]
Manikumar Reddy commented on KAFKA-2213:
----------------------------------------
CASE A: In normal scenarios, logs are already written with the broker/topic
compression type. During compaction, we just recompact with the same
compression type.
CASE B: In some scenarios, we may change the compression type of an existing
topic via the per-topic compression config. In that case, the log may contain
messages with different compression types. Do we want to handle this during
compaction? (A sketch of such a config change follows.)
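For illustration, a minimal sketch of the per-topic config change that produces
CASE B, using the Java AdminClient API (which postdates this ticket; the broker
address, topic name, and codec below are placeholders). Only batches appended
after the change use the new codec; segments already on disk keep their
original one, so the log ends up with mixed compression types:
{code:java}
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class ChangeTopicCompression {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        try (Admin admin = Admin.create(props)) {
            ConfigResource topic =
                new ConfigResource(ConfigResource.Type.TOPIC, "my-topic");
            // New appends are compressed with snappy; existing segments keep
            // their original codec, yielding mixed compression types (CASE B).
            AlterConfigOp setCodec = new AlterConfigOp(
                new ConfigEntry("compression.type", "snappy"),
                AlterConfigOp.OpType.SET);
            admin.incrementalAlterConfigs(
                Collections.singletonMap(topic,
                    Collections.singletonList(setCodec))).all().get();
        }
    }
}
{code}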
Currently, during compaction, we decompress a message set and write back the
compacted message set (smaller in size). So we preserve the producer-side
batching, just with fewer messages.
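To make that concrete, a toy model of one cleaning pass, not the cleaner's
actual code: Msg, compact, and recompress are hypothetical names, and GZIP
stands in for whatever codec the batch was originally written with. The point
is that the retained messages are re-encoded as one unit with the same codec:
{code:java}
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.zip.GZIPOutputStream;

public class CleanerSketch {
    record Msg(String key, String value) {}

    // Keep only the latest value per key; later offsets overwrite earlier ones.
    static List<Msg> compact(List<Msg> batch) {
        LinkedHashMap<String, Msg> latest = new LinkedHashMap<>();
        for (Msg m : batch) {
            latest.put(m.key(), m);
        }
        return new ArrayList<>(latest.values());
    }

    // Stand-in for re-encoding the retained set with the batch's original codec.
    static byte[] recompress(List<Msg> retained) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            for (Msg m : retained) {
                gz.write((m.key() + "=" + m.value() + "\n").getBytes());
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        List<Msg> batch = List.of(
            new Msg("k1", "v1"), new Msg("k2", "v2"), new Msg("k1", "v3"));
        List<Msg> retained = compact(batch); // [k1=v3, k2=v2]
        byte[] rewritten = recompress(retained);
        System.out.println(retained + " -> " + rewritten.length + " compressed bytes");
    }
}
{code}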
In CASE B, we have to change the compression type (non-compressed messages ->
compressed, compressed -> non-compressed, compressed -> compressed). AFAIK,
batching on the producer side is controlled by the batch.size (in bytes)
config. Do we need to introduce a similar server-side parameter, in bytes or
number of messages?
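For reference, this is how the producer-side knob works today: compression is
applied per producer batch, so batch.size effectively bounds the compressed
unit. The broker address and codec below are placeholders; any server-side
analogue for the cleaner (capping how many retained messages are grouped
before recompression) would be a new, hypothetical config:
{code:java}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;

public class BatchingProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
            StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
            StringSerializer.class.getName());
        // Accumulate up to 16 KB per partition before a batch is sent;
        // the whole batch is compressed as one unit with the codec below.
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, "16384");
        props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "snappy");
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // producer.send(...) would go here; omitted in this sketch.
        }
    }
}
{code}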
> Log cleaner should write compacted messages using configured compression type
> -----------------------------------------------------------------------------
>
> Key: KAFKA-2213
> URL: https://issues.apache.org/jira/browse/KAFKA-2213
> Project: Kafka
> Issue Type: Bug
> Reporter: Joel Koshy
>
> In KAFKA-1374 the log cleaner was improved to handle compressed messages.
> There were a couple of follow-ups from that:
> * We write compacted messages using the original compression type in the
> compressed message-set. We should instead append all retained messages with
> the configured broker compression type of the topic.
> * While compressing messages we should ideally do some batching before
> compression.
> * Investigate the use of the client compressor. (See the discussion in the
> RBs for KAFKA-1374)