mjsax commented on code in PR #14322:
URL: https://github.com/apache/kafka/pull/14322#discussion_r1317996699
##########
docs/design.html:
##########
@@ -136,8 +136,10 @@ <h4 class="anchor-heading"><a id="design_compression" class="anchor-link"></a><a
        the user can always compress its messages one at a time without any support needed from Kafka, but this can lead to very poor compression ratios as much of the redundancy is due to repetition between messages of
        the same type (e.g. field names in JSON or user agents in web logs or common string values). Efficient compression requires compressing multiple messages together rather than compressing each message individually.
        <p>
-       Kafka supports this with an efficient batching format. A batch of messages can be clumped together compressed and sent to the server in this form. This batch of messages will be written in compressed form and will
-       remain compressed in the log and will only be decompressed by the consumer.
+       Kafka supports this with an efficient batching format. A batch of messages can be grouped together, compressed, and sent to the server in this form. The broker decompresses the batch in order to validate it. For
+       example, it validates that the number of records in the batch is the same as what the batch header states. The broker may also potentially modify the batch (e.g., if the topic is compacted, the broker will filter out
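As context for the new text: the batch compression it describes is driven by ordinary producer settings. A minimal sketch, assuming a broker on localhost:9092 and an illustrative `web-logs` topic (the topic name, serializers, and tuning values are mine, not part of the PR):

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class CompressedProducerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed local broker
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        // The producer compresses whole batches, not individual records, so
        // redundancy *between* messages (repeated JSON field names, user
        // agents, ...) is exactly what the codec gets to exploit.
        props.put("compression.type", "lz4");
        // A larger batch plus a short linger lets more records accumulate
        // into one compressed batch before it is sent to the broker.
        props.put("batch.size", 64 * 1024);
        props.put("linger.ms", 10);

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            for (int i = 0; i < 1000; i++) {
                producer.send(new ProducerRecord<>("web-logs", "user-" + i,
                        "{\"userAgent\":\"Mozilla/5.0\",\"path\":\"/index\"}"));
            }
        }
    }
}
```

The decompress-and-validate step the new sentences describe operates on such a batch as a unit, which is why the broker can check the record count against the batch header.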
Review Comment:
Yeah, the sentence sounds as if the broker would perform a compaction, which from my understanding won't be the case -- my understanding is that the broker would never _modify_ a batch (it might re-compress it with a different compression format though, depending on broker/topic configs).
For compacted topics and null keys, the batch would be rejected with an error message back to the producer.
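To make the two configs the comment alludes to concrete: broker-side re-compression is governed by the topic/broker-level `compression.type` setting, and compaction is requested via `cleanup.policy=compact`. A minimal sketch, assuming a local broker and a hypothetical `user-profiles` topic (my illustration, not part of the PR):

```java
import java.util.Collections;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.NewTopic;

public class CompactedTopicExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed local broker
        try (Admin admin = Admin.create(props)) {
            NewTopic topic = new NewTopic("user-profiles", 3, (short) 1)
                    .configs(Map.of(
                            // Log compaction: records need non-null keys; a
                            // null-key produce request fails with an error
                            // rather than being filtered out of the batch.
                            "cleanup.policy", "compact",
                            // "producer" keeps batches compressed exactly as
                            // the producer sent them; naming a concrete codec
                            // instead would make the broker re-compress
                            // batches on write.
                            "compression.type", "producer"));
            admin.createTopics(Collections.singletonList(topic)).all().get();
        }
    }
}
```

With `compression.type=producer` (the default) the broker stores batches with whatever codec the producer chose; per the comment above, that re-compression path is the closest the broker gets to "modifying" a batch.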