Travis Bischel created KAFKA-15656:
--------------------------------------

             Summary: Frequent INVALID_RECORD on Kafka 3.6
                 Key: KAFKA-15656
                 URL: https://issues.apache.org/jira/browse/KAFKA-15656
             Project: Kafka
          Issue Type: Bug
          Components: producer 
    Affects Versions: 3.6.0
            Reporter: Travis Bischel
         Attachments: invalid_record.log

Using this docker-compose.yml:
{noformat}
version: "3.7"
services:
  kafka:
    image: bitnami/kafka:latest
    network_mode: host
    environment:
      KAFKA_ENABLE_KRAFT: yes
      KAFKA_CFG_PROCESS_ROLES: controller,broker
      KAFKA_CFG_CONTROLLER_LISTENER_NAMES: CONTROLLER
      KAFKA_CFG_LISTENERS: PLAINTEXT://:9092,CONTROLLER://:9093
      KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP: 
CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT
      KAFKA_CFG_CONTROLLER_QUORUM_VOTERS: 1@127.0.0.1:9093
      # Set this to "PLAINTEXT://127.0.0.1:9092" if you want to run this 
container on localhost via Docker
      KAFKA_CFG_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
      KAFKA_CFG_NODE_ID: 1
      ALLOW_PLAINTEXT_LISTENER: yes
      KAFKA_KRAFT_CLUSTER_ID: XkpGZQ27R3eTl3OdTm2LYA # 16 byte base64-encoded 
UUID{noformat}
And running franz-go integration tests with KGO_TEST_RF=1, I consistently 
receive INVALID_RECORD errors.

 

Looking at the container logs, I see these problematic log lines:
{noformat}
2023-10-19 23:33:47,942] ERROR [ReplicaManager broker=1] Error processing 
append operation on partition 
0cf2f3faaafd3f906ea848b684b04833ca162bcd19ecae2cab36767a54f248c7-0 
(kafka.server.ReplicaManager) 
org.apache.kafka.common.InvalidRecordException: Invalid negative header key 
size -25
[2023-10-19 23:33:47,942] ERROR [ReplicaManager broker=1] Error processing 
append operation on partition 
0cf2f3faaafd3f906ea848b684b04833ca162bcd19ecae2cab36767a54f248c7-6 
(kafka.server.ReplicaManager) 
org.apache.kafka.common.InvalidRecordException: Reached end of input stream 
before skipping all bytes. Remaining bytes:94
[2023-10-19 23:33:47,942] ERROR [ReplicaManager broker=1] Error processing 
append operation on partition 
0cf2f3faaafd3f906ea848b684b04833ca162bcd19ecae2cab36767a54f248c7-1 
(kafka.server.ReplicaManager) 
org.apache.kafka.common.InvalidRecordException: Found invalid number of record 
headers -26
[2023-10-19 23:33:47,948] ERROR [ReplicaManager broker=1] Error processing 
append operation on partition 
0cf2f3faaafd3f906ea848b684b04833ca162bcd19ecae2cab36767a54f248c7-6 
(kafka.server.ReplicaManager) 
org.apache.kafka.common.InvalidRecordException: Found invalid number of record 
headers -27
[2023-10-19 23:33:47,950] ERROR [ReplicaManager broker=1] Error processing 
append operation on partition 
0cf2f3faaafd3f906ea848b684b04833ca162bcd19ecae2cab36767a54f248c7-22 
(kafka.server.ReplicaManager)
org.apache.kafka.common.InvalidRecordException: Invalid negative header key 
size -25
[2023-10-19 23:33:47,947] ERROR [ReplicaManager broker=1] Error processing 
append operation on partition 
c63b6e30987317fad18815effb8d432b6df677d2ab56cf6da517bb93fa49b74b-25 
(kafka.server.ReplicaManager)
org.apache.kafka.common.InvalidRecordException: Found invalid number of record 
headers -50
[2023-10-19 23:33:47,959] ERROR [ReplicaManager broker=1] Error processing 
append operation on partition 
c63b6e30987317fad18815effb8d432b6df677d2ab56cf6da517bb93fa49b74b-25 
(kafka.server.ReplicaManager) 
 {noformat}
 

I modified franz-go with a diff to print the request that was written to the 
wire once this error occurs. Attached is a v9 produce request. I deserialized 
it locally and am not seeing the corrupt data that Kafka is printing. It's 
possible there is a bug in the client, but again, these tests have never 
received this error pre-Kafka 3.6. It _looks like_ there is either corruption 
when processing the incoming data, or there is some problematic race condition 
in the broker - I'm not sure which.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to