hachikuji commented on a change in pull request #9590:
URL: https://github.com/apache/kafka/pull/9590#discussion_r621734730
##########
File path: core/src/test/scala/unit/kafka/log/LogCleanerTest.scala
##########
@@ -984,19 +1003,26 @@ class LogCleanerTest {
     def distinctValuesBySegment = log.logSegments.map(s => s.log.records.asScala.map(record => TestUtils.readString(record.value)).toSet.size).toSeq
 
-    val disctinctValuesBySegmentBeforeClean = distinctValuesBySegment
+    val distinctValuesBySegmentBeforeClean = distinctValuesBySegment
     assertTrue(distinctValuesBySegment.reverse.tail.forall(_ > N),
       "Test is not effective unless each segment contains duplicates. Increase segment size or decrease number of keys.")
 
+    log.updateHighWatermark(log.activeSegment.baseOffset)
     cleaner.clean(LogToClean(new TopicPartition("test", 0), log, 0, firstUncleanableOffset))
 
     val distinctValuesBySegmentAfterClean = distinctValuesBySegment
 
-    assertTrue(disctinctValuesBySegmentBeforeClean.zip(distinctValuesBySegmentAfterClean)
-      .take(numCleanableSegments).forall { case (before, after) => after < before },
+    // One segment should have been completely deleted, so there will be fewer segments.
+    assertTrue(distinctValuesBySegmentAfterClean.size < distinctValuesBySegmentBeforeClean.size)
+
+    // Drop the first segment from before cleaning since it was removed. Also subtract 1 from numCleanableSegments
+    val normalizedDistinctValuesBySegmentBeforeClean = distinctValuesBySegmentBeforeClean.drop(1)

Review comment:
The logic in this test case has become rather obscure after the change. Maybe we could do something simpler than comparing segment by segment. As far as I can tell, all the test is doing is ensuring that the first uncleanable offset is respected. Maybe a simpler test would just write the same key over and over and then assert that all records below the uncleanable offset are removed and all records above that offset are retained?
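
The simpler test suggested in the review comment could take roughly the following shape. This is a self-contained toy model, not Kafka code: `Record`, `compact`, and `UncleanableOffsetSketch` are hypothetical names, and the compaction rule (keep only the newest record per key among offsets below `firstUncleanableOffset`) is an assumption about how the cleanable range behaves. A real version would presumably go through the existing `LogCleanerTest` setup and `cleaner.clean(...)` rather than this stand-in.

```scala
// Toy model only: Record and compact are hypothetical stand-ins,
// not part of Kafka's LogCleaner API.
final case class Record(offset: Long, key: String, value: String)

object UncleanableOffsetSketch {

  /** Compacts only the records below firstUncleanableOffset, keeping the
    * newest record per key within that range. Records at or above the
    * offset are returned untouched.
    */
  def compact(records: Seq[Record], firstUncleanableOffset: Long): Seq[Record] = {
    val (cleanable, uncleanable) = records.partition(_.offset < firstUncleanableOffset)

    // Newest offset seen for each key inside the cleanable range.
    val latestOffsetByKey: Map[String, Long] =
      cleanable.groupBy(_.key).map { case (key, rs) => key -> rs.map(_.offset).max }

    val retained = cleanable.filter(r => latestOffsetByKey(r.key) == r.offset)
    retained ++ uncleanable
  }

  def main(args: Array[String]): Unit = {
    // Write the same key over and over, as the review comment proposes.
    val log = (0L until 20L).map(o => Record(o, "key", s"value-$o"))
    val firstUncleanableOffset = 12L

    val cleaned = compact(log, firstUncleanableOffset)
    val (below, atOrAbove) = cleaned.partition(_.offset < firstUncleanableOffset)

    // Below the uncleanable offset, only the newest duplicate survives.
    assert(below.map(_.offset).toList == List(11L))
    // At or above the uncleanable offset, every record is retained.
    assert(atOrAbove.map(_.offset).toList == (12L until 20L).toList)
  }
}
```

Note that in this model the newest record below the uncleanable offset survives, so "all records below the offset are removed" reads as "all but the newest duplicate". Whether the real test should assert exactly that depends on how the cleaner builds its offset map, which this sketch does not attempt to reproduce.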