Hi Xiang,

Thanks for your question! That sentence is a justification for why the partitionLeaderEpoch field is not included in the CRC.
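
To make the difference concrete, here is a minimal standalone sketch. It is not Kafka's actual code: the class `BatchCrcSketch` and its helper methods are hypothetical, and the byte offsets are my reading of the v2 record batch layout in the message format docs, where the CRC-32C covers everything from the attributes field to the end of the batch. It assumes Java 9+ for `java.util.zip.CRC32C`.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32C;

// Hypothetical stand-in for a v2 record batch buffer; not Kafka code.
class BatchCrcSketch {
    // Assumed header offsets, following the layout in the message format docs.
    static final int PARTITION_LEADER_EPOCH_OFFSET = 12; // before the CRC-covered range
    static final int CRC_OFFSET = 17;
    static final int ATTRIBUTES_OFFSET = 21;             // CRC covers this byte to the end
    static final int MAX_TIMESTAMP_OFFSET = 35;          // inside the CRC-covered range
    static final int HEADER_SIZE = 61;

    // CRC-32C over the bytes from the attributes field to the end of the batch.
    static long computeCrc(ByteBuffer batch) {
        CRC32C crc = new CRC32C();
        crc.update(batch.array(), ATTRIBUTES_OFFSET, batch.capacity() - ATTRIBUTES_OFFSET);
        return crc.getValue();
    }

    // True if the CRC stored in the header matches the covered bytes.
    static boolean isValid(ByteBuffer batch) {
        long stored = batch.getInt(CRC_OFFSET) & 0xFFFFFFFFL;
        return stored == computeCrc(batch);
    }

    // Cheap: a single 4-byte write, because the epoch is outside the CRC range.
    static void setPartitionLeaderEpoch(ByteBuffer batch, int epoch) {
        batch.putInt(PARTITION_LEADER_EPOCH_OFFSET, epoch);
    }

    // More expensive: the write must be followed by a CRC pass over the batch.
    static void setMaxTimestamp(ByteBuffer batch, long maxTimestamp) {
        batch.putLong(MAX_TIMESTAMP_OFFSET, maxTimestamp);
        batch.putInt(CRC_OFFSET, (int) computeCrc(batch));
    }

    // Builds a zeroed header followed by the given record bytes, with a valid CRC.
    static ByteBuffer newBatch(byte[] records) {
        ByteBuffer batch = ByteBuffer.allocate(HEADER_SIZE + records.length);
        System.arraycopy(records, 0, batch.array(), HEADER_SIZE, records.length);
        batch.putInt(CRC_OFFSET, (int) computeCrc(batch));
        return batch;
    }

    public static void main(String[] args) {
        ByteBuffer batch = newBatch(new byte[] {1, 2, 3});

        setPartitionLeaderEpoch(batch, 7);
        System.out.println(isValid(batch)); // true: the epoch is not covered by the CRC

        batch.putLong(MAX_TIMESTAMP_OFFSET, 42L); // raw write, CRC not updated
        System.out.println(isValid(batch)); // false: a covered field changed

        setMaxTimestamp(batch, 42L);
        System.out.println(isValid(batch)); // true: the CRC was recomputed
    }
}
```

The cost of `setMaxTimestamp` in this sketch grows with the batch size, since it rereads every covered byte, while `setPartitionLeaderEpoch` stays a constant-time write.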

If you mutate fields which are included in a CRC, you need to recompute the CRC value. See [1] for mutating the maxTimestamp. Compare that with [2] for setting the partitionLeaderEpoch. This makes setting the partitionLeaderEpoch faster than setting the max timestamp.

And because setting the partitionLeaderEpoch happens on every Produce request, it was optimized in the protocol design. It does have the tradeoff that corruptions in the partitionLeaderEpoch are not detected by the CRC, but someone decided this was worth the optimization to the Produce flow. I don't have more information on why this optimization was made for partitionLeaderEpoch and not maxTimestamp.

Hope this helps,
Greg

[1] https://github.com/apache/kafka/blob/2d896d9130f121e75ccba2d913bdffa358cf3867/clients/src/main/java/org/apache/kafka/common/record/DefaultRecordBatch.java#L371-L382
[2] https://github.com/apache/kafka/blob/2d896d9130f121e75ccba2d913bdffa358cf3867/clients/src/main/java/org/apache/kafka/common/record/DefaultRecordBatch.java#L385-L387

On Tue, Oct 22, 2024 at 7:51 PM Xiang Zhang <xiangzhang1...@gmail.com> wrote:

> Hi all,
>
> I am reading official doc here:
> https://kafka.apache.org/documentation/#messageformat, and I could not
> fully understand it. If someone can clarify it for me, it would be much
> appreciated. The sentence is
>
> The partition leader epoch field is not included in the CRC computation to
> avoid the need to recompute the CRC when this field is assigned for every
> batch that is received by the broker.
>
> I just don’t really get what the highlight part is trying to say.
>
> Regards,
> Xiang Zhang
>