rdhabalia opened a new pull request #4173: [pulsar-broker]Fix: race condition while deleting global topic
URL: https://github.com/apache/pulsar/pull/4173

### Motivation

When a client removes the local cluster from the replication-cluster list, the broker deletes the topic from that cluster. However, when the broker receives multiple ZooKeeper watch notifications for a policies update, a race condition can prevent it from fencing the topic safely, which allows multiple threads to delete the same topic. In that case, the first thread deletes the topic successfully, while the second thread fails, keeps retrying the delete, and logs the error below.

```
18:41:00.190 [bookkeeper-ml-workers-OrderedExecutor-5-0] ERROR org.apache.pulsar.broker.service.persistent.PersistentTopic - [persistent://my-prop/global/my-namespace/8:] Error deleting topic
org.apache.bookkeeper.mledger.ManagedLedgerException$MetaStoreException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:114) ~[pulsar-broker-2.2.8-yahoo.jar:2.2.8-yahoo]
        at org.apache.bookkeeper.mledger.impl.MetaStoreImplZookeeper.lambda$null$108(MetaStoreImplZookeeper.java:325) ~[managed-ledger-original-2.2.8-yahoo.jar:2.2.8-yahoo]
        at org.apache.bookkeeper.mledger.util.SafeRun$1.safeRun(SafeRun.java:32) [managed-ledger-original-2.2.8-yahoo.jar:2.2.8-yahoo]
        at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [bookkeeper-common-4.7.2.jar:4.7.2]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_181]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_181]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [pulsar-functions-metrics-2.2.8-yahoo.jar:?]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
18:41:00.190 [bookkeeper-ml-workers-OrderedExecutor-5-0] ERROR org.apache.pulsar.broker.service.persistent.PersistentTopic - [persistent://my-prop/global/my-namespace/8:] Policies update failed org.apache.pulsar.broker.service.BrokerServiceException$PersistenceException: org.apache.bookkeeper.mledger.ManagedLedgerException$MetaStoreException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode, scheduled retry in 60 seconds
java.util.concurrent.CompletionException: org.apache.pulsar.broker.service.BrokerServiceException$PersistenceException: org.apache.bookkeeper.mledger.ManagedLedgerException$MetaStoreException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode
        at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292) ~[?:1.8.0_181]
        at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308) ~[?:1.8.0_181]
        at java.util.concurrent.CompletableFuture.uniAccept(CompletableFuture.java:647) ~[?:1.8.0_181]
        at java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:632) ~[?:1.8.0_181]
        at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) ~[?:1.8.0_181]
        at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977) ~[?:1.8.0_181]
        at org.apache.pulsar.broker.service.persistent.PersistentTopic$4.deleteLedgerFailed(PersistentTopic.java:782) ~[pulsar-broker-2.2.8-yahoo.jar:2.2.8-yahoo]
        at org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl$17.operationFailed(ManagedLedgerImpl.java:2010) ~[managed-ledger-original-2.2.8-yahoo.jar:2.2.8-yahoo]
        at org.apache.bookkeeper.mledger.impl.MetaStoreImplZookeeper.lambda$null$108(MetaStoreImplZookeeper.java:325) ~[managed-ledger-original-2.2.8-yahoo.jar:2.2.8-yahoo]
        at org.apache.bookkeeper.mledger.util.SafeRun$1.safeRun(SafeRun.java:32) [managed-ledger-original-2.2.8-yahoo.jar:2.2.8-yahoo]
        at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [bookkeeper-common-4.7.2.jar:4.7.2]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_181]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_181]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [pulsar-functions-m
```

### Modification

- Fence the topic safely and handle the topic-already-deleted exception so that the delete completes successfully.
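
The idea behind the modification can be illustrated with a minimal, self-contained sketch. It is not the code from this pull request: `FencedTopic`, `AlreadyDeletedException`, `deleteFromStore`, and `storeDeleteCalls` are hypothetical names, and the metadata store is simulated with an in-memory flag. The sketch assumes the fence is an atomic flag that only one caller can acquire, and that an "already deleted" failure (the `NoNode` case in the log above) can be reported as success because the desired end state has been reached.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the "node is already gone" failure (NoNode in the log).
class AlreadyDeletedException extends RuntimeException {
    AlreadyDeletedException(String message) {
        super(message);
    }
}

// Hypothetical topic model for illustration; this is not Pulsar's PersistentTopic.
class FencedTopic {
    private final AtomicBoolean fenced = new AtomicBoolean(false);
    private final AtomicBoolean nodeExists; // simulated metadata node
    final AtomicInteger storeDeleteCalls = new AtomicInteger();

    FencedTopic(boolean nodeExists) {
        this.nodeExists = new AtomicBoolean(nodeExists);
    }

    // Simulated metadata-store delete: fails if the node was already removed.
    private CompletableFuture<Void> deleteFromStore() {
        storeDeleteCalls.incrementAndGet();
        CompletableFuture<Void> future = new CompletableFuture<>();
        if (nodeExists.compareAndSet(true, false)) {
            future.complete(null);
        } else {
            future.completeExceptionally(new AlreadyDeletedException("NoNode"));
        }
        return future;
    }

    CompletableFuture<Void> delete() {
        CompletableFuture<Void> result = new CompletableFuture<>();
        // Atomic check-and-set: only one caller can acquire the fence.
        if (!fenced.compareAndSet(false, true)) {
            result.completeExceptionally(new IllegalStateException("Topic is already fenced"));
            return result;
        }
        deleteFromStore().whenComplete((ignored, ex) -> {
            Throwable cause = (ex instanceof CompletionException && ex.getCause() != null)
                    ? ex.getCause() : ex;
            if (cause == null || cause instanceof AlreadyDeletedException) {
                // Already deleted means the goal is met, so report success and do not retry.
                result.complete(null);
            } else {
                // Any other failure: release the fence so a later attempt can retry.
                fenced.set(false);
                result.completeExceptionally(cause);
            }
        });
        return result;
    }
}

class FencedDeleteDemo {
    public static void main(String[] args) {
        FencedTopic topic = new FencedTopic(true);
        System.out.println("first delete failed: " + topic.delete().isCompletedExceptionally());
        System.out.println("second delete rejected: " + topic.delete().isCompletedExceptionally());
        System.out.println("store delete calls: " + topic.storeDeleteCalls.get());

        // Metadata already removed elsewhere: delete still completes normally.
        FencedTopic alreadyGone = new FencedTopic(false);
        System.out.println("already-gone delete failed: "
                + alreadyGone.delete().isCompletedExceptionally());
    }
}
```

In this sketch the second caller is rejected at the fence and never reaches the store, so it cannot trigger the failing delete that drives the retry loop. Whether the real fix rejects or queues concurrent callers is not stated in the description above.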
