[ https://issues.apache.org/jira/browse/KAFKA-19596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Luke Chen resolved KAFKA-19596. ------------------------------- Fix Version/s: 4.2.0 Resolution: Fixed > Log auto topic creation failures more visibly > --------------------------------------------- > > Key: KAFKA-19596 > URL: https://issues.apache.org/jira/browse/KAFKA-19596 > Project: Kafka > Issue Type: Improvement > Components: logging > Reporter: Rob Young > Assignee: Rob Young > Priority: Trivial > Fix For: 4.2.0 > > > Hi, I was playing with Share groups on the 4.1.0-rc2 and found it a little > opaque to detect a failure to auto create the `__share_group_state` topic due > to the replication factor not being satisfiable. > I start up a cluster like this (taken from the dockerhub instructions): > {code:java} > podman run --rm \ > -p 9092:9092 \ > -e KAFKA_NODE_ID=1 \ > -e KAFKA_PROCESS_ROLES=broker,controller \ > -e KAFKA_LISTENERS=PLAINTEXT://:9092,CONTROLLER://:9093 \ > -e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://localhost:9092 \ > -e KAFKA_CONTROLLER_LISTENER_NAMES=CONTROLLER \ > -e > KAFKA_LISTENER_SECURITY_PROTOCOL_MAP=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT > \ > -e KAFKA_CONTROLLER_QUORUM_VOTERS=1@localhost:9093 \ > -e KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1 \ > -e KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR=1 \ > -e KAFKA_TRANSACTION_STATE_LOG_MIN_ISR=1 \ > -e KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS=0 \ > -e KAFKA_NUM_PARTITIONS=3 \ > apache/kafka:4.1.0-rc2{code} > and then run the commands: > {code:java} > robeyoun:kafka_2.13-4.1.0$ bin/kafka-features.sh --bootstrap-server > localhost:9092 upgrade --feature share.version=1 > share.version was upgraded to 1. > robeyoun:kafka_2.13-4.1.0$ bin/kafka-topics.sh --create --topic my_topic > --bootstrap-server localhost:9092 > WARNING: Due to limitations in metric names, topics with a period ('.') or > underscore ('_') could collide. To avoid issues it is best to use either, but > not both. > Created topic my_topic. > robeyoun:kafka_2.13-4.1.0$ bin/kafka-console-share-consumer.sh > --bootstrap-server localhost:9092 --topic quickstart-events > [2025-08-12 09:50:13,130] WARN Share groups and KafkaShareConsumer are part > of a preview feature introduced by KIP-932, and are not recommended for use > in production. > (org.apache.kafka.clients.consumer.internals.ShareConsumerDelegateCreator) > [2025-08-12 09:50:13,357] WARN [ShareConsumer > clientId=console-share-consumer, groupId=console-share-consumer] The metadata > response from the cluster reported a recoverable issue with correlation id 20 > : {quickstart-events=UNKNOWN_TOPIC_OR_PARTITION} > (org.apache.kafka.clients.NetworkClient) > {code} > the consumer now sits there, appearing to be consuming happily, but the > broker is emitting a stream of logs like: > {code:java} > [2025-08-11 21:51:03,648] INFO Sent auto-creation request for > Set(__share_group_state) to the active controller. > (kafka.server.DefaultAutoTopicCreationManager) > [2025-08-11 21:51:03,648] WARN Received retriable error in find coordinator > for InitializeStateHandler using key > SharePartitionKey{groupId=console-share-consumer, > topicIdPartition=sI9vYBBGSKW_3BfG9ZJWhg:null-1}: The coordinator is not > available. > (org.apache.kafka.server.share.persister.PersisterStateManager$InitializeStateHandler){code} > So there's a clue there that it's repeatedly trying to auto-create the topic. > the underlying problem is the default replication factor of 3 for the > __share_group_state topic, but I have to crank up the log level to DEBUG > (setting -e KAFKA_LOG4J_LOGGERS="kafka=DEBUG") to see the cause: > {code:java} > [2025-08-11 21:53:58,343] INFO Sent auto-creation request for > Set(__share_group_state) to the active controller. > (kafka.server.DefaultAutoTopicCreationManager) > [2025-08-11 21:53:58,343] WARN Received retriable error in find coordinator > for InitializeStateHandler using key > SharePartitionKey{groupId=console-share-consumer, > topicIdPartition=8kjLkvNzSCy7geIIriJVCQ:null-2}: The coordinator is not > available. > (org.apache.kafka.server.share.persister.PersisterStateManager$InitializeStateHandler) > [2025-08-11 21:53:58,345] DEBUG > [broker-1-to-controller-forwarding-channel-manager]: Request > CreateTopicsRequestData(topics=[CreatableTopic(name='__share_group_state', > numPartitions=50, replicationFactor=3, assignments=[], > configs=[CreatableTopicConfig(name='compression.type', value='producer'), > CreatableTopicConfig(name='cleanup.policy', value='delete'), > CreatableTopicConfig(name='min.insync.replicas', value='2'), > CreatableTopicConfig(name='segment.bytes', value='104857600'), > CreatableTopicConfig(name='retention.ms', value='-1')])], timeoutMs=30000, > validateOnly=false) received ClientResponse(receivedTimeMs=1754949238345, > latencyMs=2, disconnected=false, timedOut=false, > requestHeader=RequestHeader(apiKey=CREATE_TOPICS, apiVersion=7, clientId=1, > correlationId=22, headerVersion=2), > responseBody=CreateTopicsResponseData(throttleTimeMs=0, > topics=[CreatableTopicResult(name='__share_group_state', > topicId=AAAAAAAAAAAAAAAAAAAAAA, errorCode=38, errorMessage='Unable to > replicate the partition 3 time(s): The target replication factor of 3 cannot > be reached because only 1 broker(s) are registered.', topicConfigErrorCode=0, > numPartitions=-1, replicationFactor=-1, configs=[])])) > (kafka.server.NodeToControllerRequestThread) > [2025-08-11 21:53:58,345] DEBUG Cleared inflight topic creation state for > HashMap(__share_group_state -> CreatableTopic(name='__share_group_state', > numPartitions=50, replicationFactor=3, assignments=[], > configs=[CreatableTopicConfig(name='compression.type', value='producer'), > CreatableTopicConfig(name='cleanup.policy', value='delete'), > CreatableTopicConfig(name='min.insync.replicas', value='2'), > CreatableTopicConfig(name='segment.bytes', value='104857600'), > CreatableTopicConfig(name='retention.ms', value='-1')])) > (kafka.server.DefaultAutoTopicCreationManager) > [2025-08-11 21:53:58,345] DEBUG Auto topic creation completed for > Set(__share_group_state) with response > CreateTopicsResponseData(throttleTimeMs=0, > topics=[CreatableTopicResult(name='__share_group_state', > topicId=AAAAAAAAAAAAAAAAAAAAAA, errorCode=38, errorMessage='Unable to > replicate the partition 3 time(s): The target replication factor of 3 cannot > be reached because only 1 broker(s) are registered.', topicConfigErrorCode=0, > numPartitions=-1, replicationFactor=-1, configs=[])]). > (kafka.server.DefaultAutoTopicCreationManager) > {code} > The point where we log the auto creation outcome has two cases where it warn > logs, but then this case is being handled by a catch all debug log. > [https://github.com/apache/kafka/blob/18045c6ac30921503deffbef1744bb365dc599fb/core/src/main/scala/kafka/server/AutoTopicCreationManager.scala#L141] > If there's an error code set, maybe it should log at warn like the other > cases? I'm happy to be assigned to this. -- This message was sent by Atlassian Jira (v8.20.10#820010)