[ https://issues.apache.org/jira/browse/KAFKA-14312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Colin McCabe resolved KAFKA-14312.
----------------------------------
    Resolution: Won't Fix

Based on the discussion, this behavior isn't unique to KRaft, and isn't a bug.
If you want a different behavior for sequence numbers, consider filing a KIP.

> Kraft + ProducerStateManager: produce requests to new partitions with a non-zero sequence number should be rejected
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-14312
>                 URL: https://issues.apache.org/jira/browse/KAFKA-14312
>             Project: Kafka
>          Issue Type: Bug
>          Components: kraft, producer
>            Reporter: Travis Bischel
>            Priority: Major
>
> h1. Background
> In Kraft mode, if I create a topic, I am occasionally seeing MetadataResponse
> with a valid leader, and if I immediately produce to that topic, I am seeing
> NOT_LEADER_FOR_PARTITION. There may be another bug causing Kraft to return a
> leader in metadata but reject requests to that leader, _but_ this is showing
> a bigger problem.
> Kafka currently accepts produce requests to new partitions with a non-zero
> sequence number. I have confirmed this locally by modifying my client to
> start producing with a sequence number of 10. Producing three records
> sequentially back to back (seq 10, 11, 12) are all successful. I _think_ this
> [comment|https://github.com/apache/kafka/blob/3e7eddecd6a63ea6a9793d3270bef6d0be5c9021/core/src/main/scala/kafka/log/ProducerStateManager.scala#L235-L236]
> in the Kafka source also indicates roughly the same thing.
> h1. Problem
> * Client initializes producer ID
> * Client creates topic "foo" (for the problem, we will ignore partitions –
>   there is just one partition)
> * Client sends produce request A with 5 records
> * Client sends produce request B with 5 records before receiving a response
>   for A
> * Broker returns NOT_LEADER_FOR_PARTITION to produce request A
> * Broker finally initializes, becomes leader before seeing request B
> * Broker accepts request B as the first request
> * Broker believes sequence number 5 is ok, and is expecting the next
>   sequence to be 10
> * Client retries requests A and B, because A failed
> * Broker sees request A with sequence 0, returns OutOfOrderSequenceException
> * Client enters a fatal state, because OOOSN is not retryable
> h1. Reproducing
> I can reliably reproduce this error using Kraft mode with 1 broker. I am
> using the following docker compose:
> {{version: "3.7"}}
> {{services:}}
> {{  kafka:}}
> {{    image: bitnami/kafka:latest}}
> {{    network_mode: host}}
> {{    environment:}}
> {{      KAFKA_ENABLE_KRAFT: yes}}
> {{      KAFKA_CFG_PROCESS_ROLES: controller,broker}}
> {{      KAFKA_CFG_CONTROLLER_LISTENER_NAMES: CONTROLLER}}
> {{      KAFKA_CFG_LISTENERS: PLAINTEXT://:9092,CONTROLLER://:9093}}
> {{      KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP: CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT}}
> {{      KAFKA_CFG_CONTROLLER_QUORUM_VOTERS: 1@127.0.0.1:9093}}
> {{      # Set this to "PLAINTEXT://127.0.0.1:9092" if you want to run this container on localhost via Docker}}
> {{      KAFKA_CFG_ADVERTISED_LISTENERS: PLAINTEXT://127.0.0.1:9092}}
> {{      KAFKA_CFG_BROKER_ID: 1}}
> {{      ALLOW_PLAINTEXT_LISTENER: yes}}
> {{      KAFKA_KRAFT_CLUSTER_ID: XkpGZQ27R3eTl3OdTm2LYA # 16 byte base64-encoded UUID}}
> {{      BITNAMI_DEBUG: true # Enable this to get more info on startup failures}}
>
> I am running the franz-go integration tests to trigger this (frequently, but
> not all of the time). However, these tests are not required. The behavior
> described above can occasionally reproduce this.
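
The sequence of events listed under "Problem" can be replayed with a small model. This is only an illustrative sketch, not Kafka code: `PartitionProducerState`, `OutOfOrderSequence`, and `replay` are hypothetical names, and the model ignores duplicate detection, epochs, and sequence wraparound. It assumes only what the report states, namely that a partition with no prior state for a producer accepts whatever sequence number arrives first.

```python
from dataclasses import dataclass
from typing import List, Optional


class OutOfOrderSequence(Exception):
    """Raised when a batch does not start at the expected next sequence."""


@dataclass
class PartitionProducerState:
    """Hypothetical, simplified per-producer state for a single partition."""

    is_leader: bool = False
    next_seq: Optional[int] = None  # None means no batch has been accepted yet

    def produce(self, first_seq: int, num_records: int) -> str:
        if not self.is_leader:
            return "NOT_LEADER_FOR_PARTITION"
        # With no prior state, any starting sequence is accepted (the
        # behavior described in the report). Otherwise it must match.
        if self.next_seq is not None and first_seq != self.next_seq:
            raise OutOfOrderSequence(
                f"expected {self.next_seq}, got {first_seq}"
            )
        self.next_seq = first_seq + num_records
        return "OK"


def replay() -> List[str]:
    """Replay requests A (seq 0-4) and B (seq 5-9) as in the report."""
    broker = PartitionProducerState()
    events = []
    # Request A arrives before the broker has become leader.
    events.append(broker.produce(first_seq=0, num_records=5))
    # The broker becomes leader, then sees request B as its first batch.
    broker.is_leader = True
    events.append(broker.produce(first_seq=5, num_records=5))
    # The client retries A, which now looks out of order to the broker.
    try:
        events.append(broker.produce(first_seq=0, num_records=5))
    except OutOfOrderSequence as exc:
        events.append(f"OUT_OF_ORDER_SEQUENCE: {exc}")
    return events


if __name__ == "__main__":
    for event in replay():
        print(event)
```

In this model the retry of A fails because B has already moved the expected sequence to 10. If the model rejected a non-zero first sequence, B would fail instead and the client could retry both requests in order.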
> I have never experienced this against the zookeeper version. It seems that
> the zk version always fully initializes a topic immediately and does not
> return NOT_LEADER_FOR_PARTITION on the first produce request. This is a
> separate problem – but the main problem described above exists in all
> versions, and _can_ be experienced in zk in very strange circumstances.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)