Matthias J. Sax created KAFKA-3981:
--------------------------------------

Summary: Possible race condition between controller cache and ZK on topic delete
Key: KAFKA-3981
URL: https://issues.apache.org/jira/browse/KAFKA-3981
Project: Kafka
Issue Type: Bug
Reporter: Matthias J. Sax
Priority: Minor

In an integration test, I delete some topics and use the following code to check that they were deleted:

{noformat}
final Set<String> expectedTopics = new HashSet<>();
expectedTopics.add(INPUT_TOPIC);
expectedTopics.add(INTERMEDIATE_USER_TOPIC);
expectedTopics.add(OUTPUT_TOPIC);
expectedTopics.add("__consumer_offsets");

Set<String> allTopics;
ZkUtils zkUtils = null;
try {
    zkUtils = ZkUtils.apply(CLUSTER.zKConnectString(), 30000, 30000, JaasUtils.isZkSecurityEnabled());

    // Poll ZooKeeper until the number of topics matches the expected count.
    do {
        allTopics = new HashSet<>();
        allTopics.addAll(scala.collection.JavaConversions.seqAsJavaList(zkUtils.getAllTopics()));
    } while (allTopics.size() != expectedTopics.size());
} finally {
    if (zkUtils != null) {
        zkUtils.close();
    }
}
assertThat(allTopics, equalTo(expectedTopics));
{noformat}

However, when I try to re-create one of the deleted topics after the loop terminates (and {{assertThat}} passes), I get:

{noformat}
java.lang.IllegalStateException: Partition [cleanup-integration-test-KSTREAM-MAP-0000000011-repartition,0] should be in the NonExistentPartition states before moving to NewPartition state. Instead it is in OnlinePartition state
{noformat}

After discussing this with [~guozhang], I got the following answer:

bq. I think I know the reason: the zk path of the topic is deleted by the background delete thread (see completeDeleteTopic in TopicDeletionManager), and once that is done, zkUtils.getAllTopics() will not have this topic any more. But the controller cache will only clear this entry after that, by triggering the listener on this zk path, so there is still a race window in which the controller still has this entry in its metadata.
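
To make the window concrete, here is a minimal toy model of the ordering described in the answer above. It uses no Kafka or ZooKeeper classes: the class {{TopicDeleteRaceSketch}}, its two sets, and all method names are hypothetical stand-ins, and the exception text is illustrative rather than the controller's real message. It only assumes what the answer states, namely that the ZK path is removed first and the controller cache is cleared later by a listener.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of the delete ordering; not Kafka code. */
public class TopicDeleteRaceSketch {

    // Hypothetical stand-in for the topic paths stored in ZooKeeper.
    private final Set<String> zkTopics = new HashSet<>();
    // Hypothetical stand-in for the controller's in-memory metadata cache.
    private final Set<String> controllerCache = new HashSet<>();

    /** Creation is rejected while the cache still knows the topic. */
    public void createTopic(String topic) {
        if (controllerCache.contains(topic)) {
            throw new IllegalStateException(
                "Partition [" + topic + ",0] is still present in the controller cache");
        }
        zkTopics.add(topic);
        controllerCache.add(topic);
    }

    /** Step 1 of deletion: the background delete thread removes the ZK path. */
    public void deleteZkPath(String topic) {
        zkTopics.remove(topic);
    }

    /** Step 2 of deletion: the listener on the ZK path fires later and clears the cache. */
    public void fireDeleteListener(String topic) {
        controllerCache.remove(topic);
    }

    /** What a check based only on ZooKeeper would observe. */
    public boolean visibleInZk(String topic) {
        return zkTopics.contains(topic);
    }

    public static void main(String[] args) {
        TopicDeleteRaceSketch cluster = new TopicDeleteRaceSketch();
        cluster.createTopic("output");

        // Only step 1 has happened: a ZK-based check already reports the topic as gone.
        cluster.deleteZkPath("output");
        System.out.println("visible in ZK: " + cluster.visibleInZk("output"));

        // Re-creating inside the window fails because the cache entry is still there.
        try {
            cluster.createTopic("output");
        } catch (IllegalStateException e) {
            System.out.println("re-create inside the window failed: " + e.getMessage());
        }

        // Once the listener has run, re-creation goes through.
        cluster.fireDeleteListener("output");
        cluster.createTopic("output");
        System.out.println("re-create after the listener ran succeeded");
    }
}
```

In this model a check that looks only at the ZK topic list can pass between step 1 and step 2, which matches the behaviour seen in the test.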