[ https://issues.apache.org/jira/browse/KAFKA-15656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Travis Bischel resolved KAFKA-15656. ------------------------------------ Resolution: Duplicate Closing this as a different manifestation of (and thus, duplicate of) KAFKA-15653 > Frequent INVALID_RECORD on Kafka 3.6 > ------------------------------------ > > Key: KAFKA-15656 > URL: https://issues.apache.org/jira/browse/KAFKA-15656 > Project: Kafka > Issue Type: Bug > Components: producer > Affects Versions: 3.6.0 > Reporter: Travis Bischel > Priority: Major > Attachments: invalid_record.log > > > Using this docker-compose.yml: > {noformat} > version: "3.7" > services: > kafka: > image: bitnami/kafka:latest > network_mode: host > environment: > KAFKA_ENABLE_KRAFT: yes > KAFKA_CFG_PROCESS_ROLES: controller,broker > KAFKA_CFG_CONTROLLER_LISTENER_NAMES: CONTROLLER > KAFKA_CFG_LISTENERS: PLAINTEXT://:9092,CONTROLLER://:9093 > KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP: > CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT > KAFKA_CFG_CONTROLLER_QUORUM_VOTERS: 1@127.0.0.1:9093 > # Set this to "PLAINTEXT://127.0.0.1:9092" if you want to run this > container on localhost via Docker > KAFKA_CFG_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092 > KAFKA_CFG_NODE_ID: 1 > ALLOW_PLAINTEXT_LISTENER: yes > KAFKA_KRAFT_CLUSTER_ID: XkpGZQ27R3eTl3OdTm2LYA # 16 byte base64-encoded > UUID{noformat} > And running franz-go integration tests with KGO_TEST_RF=1, I consistently > receive INVALID_RECORD errors. > > Looking at the container logs, I see these problematic log lines: > {noformat} > 2023-10-19 23:33:47,942] ERROR [ReplicaManager broker=1] Error processing > append operation on partition > 0cf2f3faaafd3f906ea848b684b04833ca162bcd19ecae2cab36767a54f248c7-0 > (kafka.server.ReplicaManager) > org.apache.kafka.common.InvalidRecordException: Invalid negative header key > size -25 > [2023-10-19 23:33:47,942] ERROR [ReplicaManager broker=1] Error processing > append operation on partition > 0cf2f3faaafd3f906ea848b684b04833ca162bcd19ecae2cab36767a54f248c7-6 > (kafka.server.ReplicaManager) > org.apache.kafka.common.InvalidRecordException: Reached end of input stream > before skipping all bytes. Remaining bytes:94 > [2023-10-19 23:33:47,942] ERROR [ReplicaManager broker=1] Error processing > append operation on partition > 0cf2f3faaafd3f906ea848b684b04833ca162bcd19ecae2cab36767a54f248c7-1 > (kafka.server.ReplicaManager) > org.apache.kafka.common.InvalidRecordException: Found invalid number of > record headers -26 > [2023-10-19 23:33:47,948] ERROR [ReplicaManager broker=1] Error processing > append operation on partition > 0cf2f3faaafd3f906ea848b684b04833ca162bcd19ecae2cab36767a54f248c7-6 > (kafka.server.ReplicaManager) > org.apache.kafka.common.InvalidRecordException: Found invalid number of > record headers -27 > [2023-10-19 23:33:47,950] ERROR [ReplicaManager broker=1] Error processing > append operation on partition > 0cf2f3faaafd3f906ea848b684b04833ca162bcd19ecae2cab36767a54f248c7-22 > (kafka.server.ReplicaManager) > org.apache.kafka.common.InvalidRecordException: Invalid negative header key > size -25 > [2023-10-19 23:33:47,947] ERROR [ReplicaManager broker=1] Error processing > append operation on partition > c63b6e30987317fad18815effb8d432b6df677d2ab56cf6da517bb93fa49b74b-25 > (kafka.server.ReplicaManager) > org.apache.kafka.common.InvalidRecordException: Found invalid number of > record headers -50 > [2023-10-19 23:33:47,959] ERROR [ReplicaManager broker=1] Error processing > append operation on partition > c63b6e30987317fad18815effb8d432b6df677d2ab56cf6da517bb93fa49b74b-25 > (kafka.server.ReplicaManager) > {noformat} > > I modified franz-go with a diff to print the request that was written to the > wire once this error occurs. Attached is a v9 produce request. I deserialized > it locally and am not seeing the corrupt data that Kafka is printing. It's > possible there is a bug in the client, but again, these tests have never > received this error pre-Kafka 3.6. It _looks like_ there is either corruption > when processing the incoming data, or there is some problematic race > condition in the broker - I'm not sure which. -- This message was sent by Atlassian Jira (v8.20.10#820010)