jsancio commented on code in PR #18852: URL: https://github.com/apache/kafka/pull/18852#discussion_r1960729769
########## core/src/main/scala/kafka/log/UnifiedLog.scala: ##########

@@ -1159,6 +1177,25 @@ class UnifiedLog(@volatile var logStartOffset: Long,
       validBytesCount, lastOffsetOfFirstBatch, Collections.emptyList[RecordError], LeaderHwChange.NONE)
   }

+  /**
+   * Return true if the record batch should not be appending to the log.

Review Comment:
   Fix.

########## core/src/test/scala/kafka/raft/KafkaMetadataLogTest.scala: ##########

@@ -108,12 +116,57 @@ final class KafkaMetadataLogTest {
       classOf[RuntimeException],
       () => {
         log.appendAsFollower(
-          MemoryRecords.withRecords(initialOffset, Compression.NONE, currentEpoch, recordFoo)
+          MemoryRecords.withRecords(initialOffset, Compression.NONE, currentEpoch, recordFoo),
+          currentEpoch
         )
       }
     )
   }

+  @Test
+  def testEmptyAppendNotAllowed(): Unit = {
+    val log = buildMetadataLog(tempDir, mockTime)
+
+    assertThrows(classOf[IllegalArgumentException], () => log.appendAsFollower(MemoryRecords.EMPTY, 1))
+    assertThrows(classOf[IllegalArgumentException], () => log.appendAsLeader(MemoryRecords.EMPTY, 1))
+  }
+
+  @ParameterizedTest
+  @ArgumentsSource(classOf[InvalidMemoryRecordsProvider])

Review Comment:
   Yes, you are correct: we need to test the case where the leader epoch is invalid. I added tests for that case to `MockLogTest`, `KafkaMetadataLogTest`, and `UnifiedLogTest`.

########## core/src/main/scala/kafka/server/AbstractFetcherThread.scala: ##########

@@ -333,7 +336,9 @@ abstract class AbstractFetcherThread(name: String,
         // In this case, we only want to process the fetch response if the partition state is ready for fetch and
         // the current offset is the same as the offset requested.
         val fetchPartitionData = sessionPartitions.get(topicPartition)
-        if (fetchPartitionData != null && fetchPartitionData.fetchOffset == currentFetchState.fetchOffset && currentFetchState.isReadyForFetch) {
+        if (fetchPartitionData != null &&

Review Comment:
   Yeah, I thought about this while implementing the PR. I think we have two options:

   1. Always append up to the `currentLeaderEpoch`: the FETCH request's `currentLeaderEpoch` if the request version supports it, or the locally recorded `currentLeaderEpoch` if the FETCH request version doesn't support the `currentLeaderEpoch` field. This is what this PR implements.
   2. Only append records up to the `currentLeaderEpoch` if the local replica's `currentLeaderEpoch` still matches the leader epoch at the time the FETCH request was created and sent.

   I think both options are correct. Option 1 accepts and handles a superset of the FETCH responses that option 2 can handle. I figured that if both are correct, it is better to make progress faster and with fewer FETCH RPCs. What do you think?

########## core/src/main/scala/kafka/log/UnifiedLog.scala: ##########

@@ -1159,6 +1177,25 @@ class UnifiedLog(@volatile var logStartOffset: Long,
       validBytesCount, lastOffsetOfFirstBatch, Collections.emptyList[RecordError], LeaderHwChange.NONE)
   }

+  /**
+   * Return true if the record batch should not be appending to the log.
+   *
+   * @param batch the batch to validate
+   * @param origin the reason for appending the record batch
+   * @param leaderEpoch the epoch to compare
+   * @return true if the append reason is replication and the partition leader epoch is greater
+   *         than the leader epoch, otherwise false

Review Comment:
   Fair. I improved the wording.

########## raft/src/test/java/org/apache/kafka/raft/InvalidMemoryRecordsProvider.java: ##########

@@ -0,0 +1,143 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.
+ * You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.kafka.raft;
+
+import org.apache.kafka.common.errors.CorruptRecordException;
+import org.apache.kafka.common.record.LegacyRecord;
+import org.apache.kafka.common.record.MemoryRecords;
+import org.apache.kafka.common.record.RecordBatch;
+import org.apache.kafka.common.record.Records;
+
+import org.junit.jupiter.api.extension.ExtensionContext;
+import org.junit.jupiter.params.provider.Arguments;
+import org.junit.jupiter.params.provider.ArgumentsProvider;
+
+import java.nio.ByteBuffer;
+import java.util.Random;
+import java.util.stream.Stream;
+
+public final class InvalidMemoryRecordsProvider implements ArgumentsProvider {
+    // Use a base offset that is not zero so that it is less likely to match the LEO
+    private static final long BASE_OFFSET = 1234;
+    public static final int EPOCH = 4321;
+
+    // TODO: use jqwik support for random generators
+    public static MemoryRecords buildRandomRecords(Random random) {
+        int size = random.nextInt(255) + 1;
+        byte[] bytes = new byte[size];
+        random.nextBytes(bytes);
+
+        return MemoryRecords.readableRecords(ByteBuffer.wrap(bytes));
+    }
+
+    /**
+     * Returns a stream of arguments pairing invalid memory records with the expected exception.
+     *
+     * The first object in each Arguments is a MemoryRecords.
+     *
+     * The second object in each Arguments is a Class&lt;Exception&gt;, which is the expected
+     * exception from the log layer.
+     */
+    @Override
+    public Stream<? extends Arguments> provideArguments(ExtensionContext context) {
+        return Stream.of(
+            Arguments.of(MemoryRecords.readableRecords(notEnoughBytes()), CorruptRecordException.class),
+            Arguments.of(MemoryRecords.readableRecords(recordsSizeTooSmall()), CorruptRecordException.class),
+            Arguments.of(MemoryRecords.readableRecords(notEnoughBytesToMagic()), CorruptRecordException.class),
+            Arguments.of(MemoryRecords.readableRecords(negativeMagic()), CorruptRecordException.class),
+            Arguments.of(MemoryRecords.readableRecords(largeMagic()), CorruptRecordException.class),
+            Arguments.of(MemoryRecords.readableRecords(lessBytesThanRecordSize()), CorruptRecordException.class)
+        );
+    }
+
+    private static ByteBuffer notEnoughBytes() {
+        var buffer = ByteBuffer.allocate(Records.LOG_OVERHEAD - 1);
+        buffer.limit(buffer.capacity());
+
+        return buffer;
+    }
+
+    private static ByteBuffer recordsSizeTooSmall() {
+        var buffer = ByteBuffer.allocate(256);
+        // Write the base offset
+        buffer.putLong(BASE_OFFSET);
+        // Write record size
+        buffer.putInt(LegacyRecord.RECORD_OVERHEAD_V0 - 1);
+        buffer.position(0);
+        buffer.limit(buffer.capacity());
+
+        return buffer;
+    }
+
+    private static ByteBuffer notEnoughBytesToMagic() {
+        var buffer = ByteBuffer.allocate(256);
+        // Write the base offset
+        buffer.putLong(BASE_OFFSET);
+        // Write record size
+        buffer.putInt(buffer.capacity() - Records.LOG_OVERHEAD);
+        buffer.position(0);
+        buffer.limit(Records.HEADER_SIZE_UP_TO_MAGIC - 1);
+
+        return buffer;
+    }
+
+    private static ByteBuffer negativeMagic() {
+        var buffer = ByteBuffer.allocate(256);
+        // Write the base offset
+        buffer.putLong(BASE_OFFSET);
+        // Write record size
+        buffer.putInt(buffer.capacity() - Records.LOG_OVERHEAD);
+        // Write the epoch
+        buffer.putInt(EPOCH);
+        // Write magic
+        buffer.put((byte) -1);
+        buffer.position(0);
+        buffer.limit(buffer.capacity());
+
+        return buffer;
+    }
+
+    private static ByteBuffer largeMagic() {

Review Comment:
   A large magic is one kind of incorrect magic; another kind is a negative number. That is why I added both `negativeMagic` and `largeMagic`.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
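[Editor's note] To make the `AbstractFetcherThread` discussion above concrete, here is a minimal sketch of the epoch check that option 1 implies. All names (`EpochCheckSketch`, `shouldSkip`, the `AppendOrigin` enum) are hypothetical stand-ins, not the actual Kafka API: a replication append is skipped once a batch carries a partition leader epoch newer than the `currentLeaderEpoch` the fetch was validated against, rather than being treated as an error.

```java
import java.util.List;

public class EpochCheckSketch {
    // Hypothetical stand-in for Kafka's append origin; not the real enum.
    enum AppendOrigin { REPLICATION, CLIENT }

    // Option 1 from the discussion: a follower appends batches only up to
    // the currentLeaderEpoch it validated the FETCH against. A batch with a
    // newer partition leader epoch is skipped and re-fetched later.
    static boolean shouldSkip(AppendOrigin origin, int batchEpoch, int currentLeaderEpoch) {
        return origin == AppendOrigin.REPLICATION && batchEpoch > currentLeaderEpoch;
    }

    public static void main(String[] args) {
        int currentLeaderEpoch = 5;
        // Batches stamped with epochs 4, 5, and 7: only the epoch-7 batch is skipped.
        for (int batchEpoch : List.of(4, 5, 7)) {
            System.out.println(batchEpoch + " skip=" +
                shouldSkip(AppendOrigin.REPLICATION, batchEpoch, currentLeaderEpoch));
        }
    }
}
```

Under this sketch, option 2 would additionally skip the whole response whenever the local epoch has moved past the one the request was built with, which is why option 1 handles a superset of responses.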
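[Editor's note] The buffer-corruption helpers in `InvalidMemoryRecordsProvider` rely only on the v2 batch header layout: an 8-byte base offset, a 4-byte size, a 4-byte partition leader epoch, then a 1-byte magic. A self-contained sketch of the `negativeMagic`-style construction, with the header offsets hard-coded here for illustration rather than taken from `Records` (an assumption; `CorruptHeaderSketch` and `invalidMagic` are made-up names):

```java
import java.nio.ByteBuffer;

public class CorruptHeaderSketch {
    // Header offsets assumed from the provider above; hard-coded stand-ins
    // for the constants on org.apache.kafka.common.record.Records.
    static final int LOG_OVERHEAD = 12; // 8-byte base offset + 4-byte size
    static final int MAGIC_OFFSET = 16; // base offset + size + epoch

    // Build a buffer shaped like negativeMagic(): valid base offset, size,
    // and epoch, but a magic byte no record format version understands.
    static ByteBuffer invalidMagic(long baseOffset, int epoch, byte magic) {
        ByteBuffer buffer = ByteBuffer.allocate(256);
        buffer.putLong(baseOffset);                    // base offset
        buffer.putInt(buffer.capacity() - LOG_OVERHEAD); // declared batch size
        buffer.putInt(epoch);                          // partition leader epoch
        buffer.put(magic);                             // invalid magic byte
        buffer.position(0);
        return buffer;
    }

    public static void main(String[] args) {
        ByteBuffer bad = invalidMagic(1234L, 4321, (byte) -1);
        System.out.println(bad.getLong(0));        // base offset: 1234
        System.out.println(bad.get(MAGIC_OFFSET)); // magic: -1
    }
}
```

Parsing such a buffer should fail fast on the magic byte, which is what the provider's `CorruptRecordException` expectations assert in the log-layer tests.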