chia7712 commented on code in PR #15621:
URL: https://github.com/apache/kafka/pull/15621#discussion_r1593241011


##########
core/src/test/scala/unit/kafka/log/LogValidatorTest.scala:
##########
@@ -404,9 +414,7 @@ class LogValidatorTest {
     assertEquals(now + 1, validatingResults.maxTimestampMs,
       s"Max timestamp should be ${now + 1}")
 
-    val expectedOffsetOfMaxTimestamp = 1
-    assertEquals(expectedOffsetOfMaxTimestamp, 
validatingResults.offsetOfMaxTimestampMs,
-      s"Offset of max timestamp should be 1")
+    assertEquals(2, validatingResults.shallowOffsetOfMaxTimestamp)

Review Comment:
   yep, that is a "TYPO" but it does not change the test. We do pass the "NONE" 
to create `LogValidator` so it will run the path `assignOffsetsNonCompressed` 
   
https://github.com/apache/kafka/blob/525b9b1d7682ae2a527ceca83fedca44b1cba11a/core/src/test/scala/unit/kafka/log/LogValidatorTest.scala#L377
   
   However, I do observe a potential bug. 
   
   **context**
   1. Those batches can have different compression
   2. We take the compression from last batch
   
https://github.com/apache/kafka/blob/525b9b1d7682ae2a527ceca83fedca44b1cba11a/core/src/main/scala/kafka/log/UnifiedLog.scala#L1180
   
   **potential bug**
   
   topic-level compression = GZIP
   batch_0 = NONE
   batch_1 = GZIP
   
   In this case, we don't rebuild records according to topic-level compression 
since the compression of "last batch" is equal to `GZIP`. Hence, it results in 
batch_0 having incorrect compression.
   
   This bug does not produce corrupt records, so we can add comments/docs to 
describe that issue. Or we can fix it by changing the `sourceCompression` to be 
a "collection" of all batches' compression, and then do conversion if one of 
them is mismatched.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to