jeqo commented on PR #14127: URL: https://github.com/apache/kafka/pull/14127#issuecomment-1691250862
@abhijeetk88 would it be possible to rebase these changes onto the latest trunk? I have been experiencing some race condition issues with the latest topic delete feature introduced in #13947 that may be related. Just for reference, here's the stack trace:

```
2023-08-22T20:15:51.4142095Z [2023-08-22 20:14:57,794] DEBUG [Controller id=1] Delete topics listener fired for topics topic1 to be deleted (kafka.controller.KafkaController)
2023-08-22T20:15:51.4142555Z [2023-08-22 20:14:57,794] INFO [Controller id=1] Starting topic deletion for topics topic1 (kafka.controller.KafkaController)
2023-08-22T20:15:51.4143475Z [2023-08-22 20:14:57,796] INFO [Topic Deletion Manager 1] Handling deletion for topics topic1 (kafka.controller.TopicDeletionManager)
2023-08-22T20:15:51.4143970Z [2023-08-22 20:14:57,799] INFO [Topic Deletion Manager 1] Deletion of topic topic1 (re)started (kafka.controller.TopicDeletionManager)
2023-08-22T20:15:51.4144474Z [2023-08-22 20:14:57,800] INFO [Controller id=1 epoch=1] Sending UpdateMetadata request to brokers HashSet() for 0 partitions (state.change.logger)
2023-08-22T20:15:51.4144970Z [2023-08-22 20:14:57,800] INFO [Controller id=1 epoch=1] Sending UpdateMetadata request to brokers HashSet() for 0 partitions (state.change.logger)
2023-08-22T20:15:51.4145466Z [2023-08-22 20:14:57,801] INFO [Controller id=1 epoch=1] Sending UpdateMetadata request to brokers HashSet(1) for 1 partitions (state.change.logger)
2023-08-22T20:15:51.4146237Z [2023-08-22 20:14:57,803] INFO [Broker id=1] Add 0 partitions and deleted 1 partitions from metadata cache in response to UpdateMetadata request sent by controller 1 epoch 1 with correlation id 5 (state.change.logger)
2023-08-22T20:15:51.4146829Z [2023-08-22 20:14:57,804] INFO [GroupCoordinator 1]: Removed 0 offsets associated with deleted partitions: topic1-0. (kafka.coordinator.group.GroupCoordinator)
2023-08-22T20:15:51.4147732Z [2023-08-22 20:14:57,837] INFO [Controller id=1 epoch=1] Partition topic1-0 state changed to (Leader:-1,ISR:1,LeaderRecoveryState:RECOVERED,LeaderEpoch:1,ZkVersion:1,ControllerEpoch:1) after removing replica 1 from the ISR as part of transition to OfflineReplica (state.change.logger)
2023-08-22T20:15:51.4148224Z [2023-08-22 20:14:57,838] INFO [Controller id=1 epoch=1] Sending UpdateMetadata request to brokers HashSet() for 0 partitions (state.change.logger)
2023-08-22T20:15:51.4148680Z [2023-08-22 20:14:57,840] INFO [Controller id=1 epoch=1] Sending StopReplica request for 1 replicas to broker 1 (state.change.logger)
2023-08-22T20:15:51.4149170Z [2023-08-22 20:14:57,847] INFO [Controller id=1 epoch=1] Sending UpdateMetadata request to brokers HashSet() for 0 partitions (state.change.logger)
2023-08-22T20:15:51.4149749Z [2023-08-22 20:14:57,849] INFO [Broker id=1] Handling StopReplica request correlationId 6 from controller 1 for 1 partitions (state.change.logger)
2023-08-22T20:15:51.4150355Z [2023-08-22 20:14:57,851] INFO [ReplicaFetcherManager on broker 1] Removed fetcher for partitions Set(topic1-0) (kafka.server.ReplicaFetcherManager)
2023-08-22T20:15:51.4150951Z [2023-08-22 20:14:57,851] INFO [ReplicaAlterLogDirsManager on broker 1] Removed fetcher for partitions Set(topic1-0) (kafka.server.ReplicaAlterLogDirsManager)
2023-08-22T20:15:51.4151425Z [2023-08-22 20:14:57,854] INFO Cancelling the RLM task for tpId: sWImPLuWRT2k-_kdGS7Utw:topic1-0 (kafka.log.remote.RemoteLogManager)
2023-08-22T20:15:51.4152148Z [2023-08-22 20:14:57,854] INFO Updating assignments for partitions added: [] and removed: [sWImPLuWRT2k-_kdGS7Utw:topic1-0] (org.apache.kafka.server.log.remote.metadata.storage.ConsumerTask)
2023-08-22T20:15:51.4152605Z [2023-08-22 20:14:57,854] INFO [Controller id=1 epoch=1] Sending StopReplica request for 1 replicas to broker 1 (state.change.logger)
2023-08-22T20:15:51.4154127Z [2023-08-22 20:14:57,855] DEBUG Assigned user-topic-partitions: {qnc7kfoRRxazxKHkOwZV0A:topic0-1=UserTopicIdPartition{topicIdPartition=qnc7kfoRRxazxKHkOwZV0A:topic0-1, metadataPartition=21, isInitialized=true, isAssigned=true}, qnc7kfoRRxazxKHkOwZV0A:topic0-0=UserTopicIdPartition{topicIdPartition=qnc7kfoRRxazxKHkOwZV0A:topic0-0, metadataPartition=21, isInitialized=true, isAssigned=true}} (org.apache.kafka.server.log.remote.metadata.storage.ConsumerTask)
2023-08-22T20:15:51.4154624Z [2023-08-22 20:14:57,861] INFO [Broker id=1] Handling StopReplica request correlationId 7 from controller 1 for 1 partitions (state.change.logger)
2023-08-22T20:15:51.4155166Z [2023-08-22 20:14:57,861] INFO [ReplicaFetcherManager on broker 1] Removed fetcher for partitions Set(topic1-0) (kafka.server.ReplicaFetcherManager)
2023-08-22T20:15:51.4155832Z [2023-08-22 20:14:57,862] INFO [ReplicaAlterLogDirsManager on broker 1] Removed fetcher for partitions Set(topic1-0) (kafka.server.ReplicaAlterLogDirsManager)
2023-08-22T20:15:51.4156473Z [2023-08-22 20:14:57,867] INFO Log for partition topic1-0 is renamed to /var/lib/kafka/data/topic1-0.14f5367c6244483cac979113bd9de896-delete and is scheduled for deletion (kafka.log.LogManager)
2023-08-22T20:15:51.4157167Z [2023-08-22 20:14:57,874] INFO [Consumer clientId=__remote_log_metadata_client_1_consumer, groupId=null] Assigned to partition(s): __remote_log_metadata-21 (org.apache.kafka.clients.consumer.KafkaConsumer)
2023-08-22T20:15:51.4158044Z [2023-08-22 20:14:57,874] INFO [Consumer clientId=__remote_log_metadata_client_1_consumer, groupId=null] Seeking to earliest offset of partition __remote_log_metadata-21 (org.apache.kafka.clients.consumer.internals.SubscriptionState)
2023-08-22T20:15:51.4158765Z [2023-08-22 20:14:57,874] INFO [Consumer clientId=__remote_log_metadata_client_1_consumer, groupId=null] Seeking to offset 327 for partition __remote_log_metadata-21 (org.apache.kafka.clients.consumer.KafkaConsumer)
2023-08-22T20:15:51.4159355Z [2023-08-22 20:14:57,875] INFO Unassigned user-topic-partitions: 1 (org.apache.kafka.server.log.remote.metadata.storage.ConsumerTask)
2023-08-22T20:15:51.4159894Z [2023-08-22 20:14:57,878] INFO Deleting the remote log segments task for partition: sWImPLuWRT2k-_kdGS7Utw:topic1-0 (kafka.log.remote.RemoteLogManager)
2023-08-22T20:15:51.4160359Z [2023-08-22 20:14:57,879] ERROR Error while stopping the partition: topic1-0, delete: true (kafka.log.remote.RemoteLogManager)
2023-08-22T20:15:51.4161031Z org.apache.kafka.server.log.remote.storage.RemoteResourceNotFoundException: No resource found for partition: sWImPLuWRT2k-_kdGS7Utw:topic1-0
2023-08-22T20:15:51.4161741Z at org.apache.kafka.server.log.remote.metadata.storage.RemotePartitionMetadataStore.getRemoteLogMetadataCache(RemotePartitionMetadataStore.java:151)
2023-08-22T20:15:51.4162515Z at org.apache.kafka.server.log.remote.metadata.storage.RemotePartitionMetadataStore.listRemoteLogSegments(RemotePartitionMetadataStore.java:137)
2023-08-22T20:15:51.4163349Z at org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager.listRemoteLogSegments(TopicBasedRemoteLogMetadataManager.java:241)
2023-08-22T20:15:51.4163720Z at kafka.log.remote.RemoteLogManager.deleteRemoteLogPartition(RemoteLogManager.java:391)
2023-08-22T20:15:51.4164030Z at kafka.log.remote.RemoteLogManager.lambda$stopPartitions$8(RemoteLogManager.java:375)
2023-08-22T20:15:51.4164243Z at java.base/java.lang.Iterable.forEach(Iterable.java:75)
2023-08-22T20:15:51.4164555Z at kafka.log.remote.RemoteLogManager.stopPartitions(RemoteLogManager.java:366)
2023-08-22T20:15:51.4164832Z at kafka.server.ReplicaManager.$anonfun$stopPartitions$5(ReplicaManager.scala:596)
2023-08-22T20:15:51.4165126Z at kafka.server.ReplicaManager.$anonfun$stopPartitions$5$adapted(ReplicaManager.scala:592)
2023-08-22T20:15:51.4165284Z at scala.Option.foreach(Option.scala:437)
2023-08-22T20:15:51.4165571Z at kafka.server.ReplicaManager.stopPartitions(ReplicaManager.scala:592)
2023-08-22T20:15:51.4165846Z at kafka.server.ReplicaManager.stopReplicas(ReplicaManager.scala:510)
2023-08-22T20:15:51.4166134Z at kafka.server.KafkaApis.handleStopReplicaRequest(KafkaApis.scala:306)
2023-08-22T20:15:51.4166344Z at kafka.server.KafkaApis.handle(KafkaApis.scala:185)
2023-08-22T20:15:51.4166611Z at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:146)
2023-08-22T20:15:51.4166797Z at java.base/java.lang.Thread.run(Thread.java:829)
2023-08-22T20:15:51.4167529Z [2023-08-22 20:14:57,880] INFO Updating assignments for partitions added: [] and removed: [sWImPLuWRT2k-_kdGS7Utw:topic1-0] (org.apache.kafka.server.log.remote.metadata.storage.ConsumerTask)
2023-08-22T20:15:51.4168823Z [2023-08-22 20:14:57,881] ERROR [Broker id=1] Ignoring StopReplica request (delete=true) from controller 1 with correlation id 7 epoch 1 for partition topic1-0 due to an unexpected org.apache.kafka.server.log.remote.storage.RemoteResourceNotFoundException exception: No resource found for partition: sWImPLuWRT2k-_kdGS7Utw:topic1-0 (state.change.logger)
2023-08-22T20:15:51.4169584Z [2023-08-22 20:14:57,882] DEBUG [Controller id=1] Delete topic callback invoked on StopReplica response received from broker 1: request error = NONE, partition errors = Map(topic1-0 -> UNKNOWN_SERVER_ERROR) (kafka.controller.KafkaController)
```