[jira] [Created] (KAFKA-12854) Add a config to allow skipping metadata cache update when topic partition is unassigned
Lincong Li created KAFKA-12854: -- Summary: Add a config to allow skipping metadata cache update when topic partition is unassigned Key: KAFKA-12854 URL: https://issues.apache.org/jira/browse/KAFKA-12854 Project: Kafka Issue Type: Improvement Components: clients Reporter: Lincong Li The "assign" method in the consumer triggers a metadata cache update if the new partition assignment is different from the current assignment. It makes sense to update the MD cache if the new assignment contains partitions that do not exist in the current assignment. However, it is unclear why updating the MD cache is necessary when the new partition assignment is a subset of the current assignment. For example, the new assignment is tp0, tp1 and the current assignment is tp0, tp1, tp2. The current behavior does not cause noticeable drawbacks in most cases. However, if the number of consumer instances is large and each consumer instance is constantly getting some topic partitions unassigned, the QPS of the MD requests sent out to update the MD cache becomes high as a result. Proposed changes: Add a config to allow skipping the metadata cache update when topic partition(s) are unassigned. Existing PR to a forked repo: https://github.com/linkedin/kafka/pull/166/files -- This message was sent by Atlassian Jira (v8.3.4#803005)
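The subset check the issue proposes can be sketched in isolation with plain Java sets (a minimal illustration; `needsMetadataUpdate` is a hypothetical helper, not the actual consumer code):

```java
import java.util.Set;

public class AssignmentCheck {
    // A metadata cache update is needed only when the proposed assignment
    // contains at least one partition absent from the current assignment.
    // When the proposed assignment is a subset, the cache already covers it.
    static boolean needsMetadataUpdate(Set<String> current, Set<String> proposed) {
        return !current.containsAll(proposed);
    }
}
```

With the example from the issue, shrinking the assignment from {tp0, tp1, tp2} to {tp0, tp1} would skip the update, while adding a new partition would still trigger one.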
[jira] [Created] (KAFKA-10823) AdminOperationException has no error code
Lincong Li created KAFKA-10823: -- Summary: AdminOperationException has no error code Key: KAFKA-10823 URL: https://issues.apache.org/jira/browse/KAFKA-10823 Project: Kafka Issue Type: Bug Reporter: Lincong Li The AdminOperationException is a kind of RuntimeException, and the fact that it carries no error code prevents proper handling of the exception. For example, an AdminOperationException can be thrown when the AdminZkClient.changeTopicConfig(...) method is invoked and the topic for which the config change is requested does not exist. When the AdminOperationException is thrown, the caller of AdminZkClient.changeTopicConfig(...) can choose to catch it, but it cannot programmatically figure out the exact cause of the exception. Hence nothing can be done safely. -- This message was sent by Atlassian Jira (v8.3.4#803005)
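One way to address this, sketched below, is to attach a machine-readable code to the exception so callers can branch on the cause (the enum constants and method names here are invented for illustration, not an actual Kafka API):

```java
// Hypothetical sketch: an admin exception that carries an error code,
// so callers can distinguish "topic not found" from other failures.
public class AdminOperationException extends RuntimeException {
    public enum ErrorCode { UNKNOWN, TOPIC_NOT_FOUND, INVALID_CONFIG }

    private final ErrorCode errorCode;

    public AdminOperationException(ErrorCode errorCode, String message) {
        super(message);
        this.errorCode = errorCode;
    }

    public ErrorCode errorCode() { return errorCode; }
}
```

A caller catching this exception could then, for example, create the topic and retry on `TOPIC_NOT_FOUND` while rethrowing everything else.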
[jira] [Created] (KAFKA-10606) Auto create non-existent topics when fetching metadata for all topics
Lincong Li created KAFKA-10606: -- Summary: Auto create non-existent topics when fetching metadata for all topics Key: KAFKA-10606 URL: https://issues.apache.org/jira/browse/KAFKA-10606 Project: Kafka Issue Type: Bug Reporter: Lincong Li The "allow auto topic creation" flag is hardcoded to be true for the fetch-all-topic metadata request: https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/requests/MetadataRequest.java#L37 In the code below, the annotation claims that "*This never causes auto-creation*". This is NOT true, and auto topic creation still gets triggered under some circumstances. So, this is a bug that needs to be fixed. https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/requests/MetadataRequest.java#L68 For example, the bug can manifest in the following situation: A topic T is being deleted and a request to fetch metadata for all topics gets sent to one broker. The broker reads the names of all topics from its metadata cache (shown below). https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/KafkaApis.scala#L1196 Then the broker authorizes all topics and makes sure that they are allowed to be described. Then the broker tries to get metadata for every authorized topic by reading the metadata cache again, once for every topic (shown below). https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/KafkaApis.scala#L1240 However, the metadata cache could have been updated while the broker was authorizing all topics: topic T and its metadata no longer exist in the cache, since the topic got deleted and metadata update requests eventually got propagated from the controller to all brokers.
So, at this point, when the broker tries to get metadata for topic T from its cache, it finds that the topic does not exist, and the broker tries to "auto create" topic T since the allow-auto-topic-creation flag was set to true in all fetch-all-topic metadata requests. I think this bug has existed since "*metadataRequest.allowAutoTopicCreation*" was introduced. -- This message was sent by Atlassian Jira (v8.3.4#803005)
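The check-then-act race described above can be modeled with a toy cache (purely illustrative; `MetadataCacheRace` and its methods are invented stand-ins, not Kafka's actual MetadataCache API):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Toy model of the broker's two-phase metadata read for a
// fetch-all-topics request. The two reads are not atomic, so a
// topic deletion can land between them.
public class MetadataCacheRace {
    private final Map<String, String> cache = new HashMap<>();

    void put(String topic, String metadata) { cache.put(topic, metadata); }
    void delete(String topic) { cache.remove(topic); }

    // Phase 1: snapshot all topic names (used for authorization).
    Set<String> allTopicNames() { return new HashSet<>(cache.keySet()); }

    // Phase 2: per-topic lookup. The topic may have vanished since the
    // snapshot; for a fetch-all-topics request the missing topic should
    // simply be skipped or reported unknown, never auto-created.
    Optional<String> metadataFor(String topic) {
        return Optional.ofNullable(cache.get(topic));
    }
}
```

In the real broker, the fix is to not honor the auto-creation flag for the all-topics case, so a topic that disappears between the snapshot and the per-topic lookup is left deleted.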
[jira] [Created] (KAFKA-7596) Add observer interface to record request and response
Lincong Li created KAFKA-7596: - Summary: Add observer interface to record request and response Key: KAFKA-7596 URL: https://issues.apache.org/jira/browse/KAFKA-7596 Project: Kafka Issue Type: Improvement Reporter: Lincong Li Assignee: Lincong Li The interface could be used in the KafkaApis class to record each request-response pair. The motivation for introducing this observer is to enable or improve a Kafka audit system. Details are discussed in -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KAFKA-7573) Add an interface that allows broker to intercept every request/response pair
Lincong Li created KAFKA-7573: - Summary: Add an interface that allows broker to intercept every request/response pair Key: KAFKA-7573 URL: https://issues.apache.org/jira/browse/KAFKA-7573 Project: Kafka Issue Type: Improvement Components: core Affects Versions: 2.0.0 Reporter: Lincong Li Assignee: Lincong Li This interface is called "observer" and it opens up several opportunities. One major opportunity is that it enables an auditing system to be built for a Kafka deployment. Details are discussed in a KIP. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
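An observer of this shape might look like the sketch below (the interface and method names are illustrative guesses, not the exact API from the KIP):

```java
import java.util.ArrayList;
import java.util.List;

public class AuditObserver {
    // Hypothetical observer contract: the broker invokes observe(...) once
    // per completed request-response pair, e.g. from KafkaApis.
    public interface RequestResponseObserver {
        void observe(String apiKey, Object request, Object response);
    }

    // A simple recording implementation an audit pipeline might plug in;
    // a real one would ship records to an external audit topic or store.
    public static class Recording implements RequestResponseObserver {
        public final List<String> records = new ArrayList<>();

        @Override
        public void observe(String apiKey, Object request, Object response) {
            records.add(apiKey + ": " + request + " -> " + response);
        }
    }
}
```

Because the broker only depends on the interface, operators could swap in their own implementation via configuration without changing broker code.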
[jira] [Created] (KAFKA-7196) Remove heartbeat delayed operation for those removed consumers at the end of each rebalance
Lincong Li created KAFKA-7196: - Summary: Remove heartbeat delayed operation for those removed consumers at the end of each rebalance Key: KAFKA-7196 URL: https://issues.apache.org/jira/browse/KAFKA-7196 Project: Kafka Issue Type: Bug Components: core, purgatory Reporter: Lincong Li During the consumer group rebalance, when the join-group phase finishes, the heartbeat delayed operation of any consumer that fails to rejoin the group should be removed from the purgatory. Otherwise, even though the member ID of the consumer has been removed from the group, its heartbeat delayed operation is still registered in the purgatory; that operation will eventually time out, and another unnecessary rebalance is triggered because of it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
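The cleanup described above amounts to purging stale entries at the end of the rebalance. A toy model (not Kafka's actual DelayedOperationPurgatory; class and method names are invented for illustration):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Toy purgatory: member id -> heartbeat deadline in ms.
public class HeartbeatPurgatory {
    private final Map<String, Long> delayed = new HashMap<>();

    void register(String memberId, long deadlineMs) {
        delayed.put(memberId, deadlineMs);
    }

    // At the end of a rebalance, drop the delayed heartbeat operations of
    // members that failed to rejoin, so their eventual timeouts cannot
    // trigger another unnecessary rebalance.
    void onRebalanceComplete(Set<String> rejoinedMembers) {
        delayed.keySet().retainAll(rejoinedMembers);
    }

    boolean hasPending(String memberId) {
        return delayed.containsKey(memberId);
    }
}
```

Without the `onRebalanceComplete` purge, the entry for the departed member would sit in the map until its deadline fired, which is exactly the spurious-rebalance path the issue describes.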