xlu233 opened a new issue, #20405: URL: https://github.com/apache/pulsar/issues/20405
### Search before asking - [X] I searched in the [issues](https://github.com/apache/pulsar/issues) and found nothing similar. ### Version OS: eulerosv2r7 Pulsar Version: 2.10.0 **Background** when broker and zookeeper broken down at the same time, sometimes the operation of deleting topic got failed, specifically, broker list is [153, 154, 155], i fault the broker 154 and the zookeeper conneted to the 153 and 155. **In a case**, proxy send topic delete request to broker 155, at the time broker 155 add all brokers as the availableBrokers, the cache /loadbalance/brokers is not updated. So broker 155 redirect request to 154 and proxy cause 502 error ``` 2023-05-24T07:20:17,211+0800 [pulsar-external-web-5-6] INFO org.eclipse.jetty.server.RequestLog - 10.75.247.191 - - [24/May/2023:07:20:17 +0800] "DELETE /admin/v2/persistent/new_tenant_1/new_namespace_1/new_topic_1/partitions HTTP/1.1" 502 0 "-" "-" 12 ``` here is log in broker 155 ``` 2023-05-24T07:20:12,170+0800 [main-EventThread] DEBUG org.apache.pulsar.common.naming.NamespaceBundleFactory - [new_tenant_1/new_namespace_1] Get bundles from getLocalZkCacheService: bundles: BundlesDataImpl(boundaries=[0x00000000, 0x40000000, 0x80000000, 0xc0000000, 0xffffffff], numBundles=4), version: Optional[0] 2023-05-24T07:20:12,170+0800 [main-EventThread] DEBUG org.apache.pulsar.broker.namespace.NamespaceService - findBrokerServiceUrl: new_tenant_1/new_namespace_1/0xc0000000_0xffffffff - options: LookupOptions(authoritative=false, readOnly=false, loadTopicsInBundle=false, requestHttps=false, advertisedListenerName=null) 2023-05-24T07:20:12,174+0800 [main-SendThread(gemaspulsar.zookeeper.replicator1.cn-north-7.myhuaweicloud.com:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Reading reply session id: 0x1018b969af30001, packet:: clientPath:null serverPath:null finished:false header:: 2293,22 replyHeader:: 2293,2649994821664,0 request:: org.apache.zookeeper.MultiOperationRecord@5a4a274b response:: org.apache.zookeeper.MultiResponse@ffffff9b 2023-05-24T07:20:12,174+0800 [pulsar-2-5] DEBUG org.apache.pulsar.broker.loadbalance.impl.LoadManagerShared - Added Shared Broker - [10.172.167.154] as possible Candidates for namespace - [new_tenant_1/new_namespace_1] 2023-05-24T07:20:12,174+0800 [pulsar-2-5] DEBUG org.apache.pulsar.broker.loadbalance.impl.LoadManagerShared - Added Shared Broker - [10.172.167.153] as possible Candidates for namespace - [new_tenant_1/new_namespace_1] 2023-05-24T07:20:12,174+0800 [pulsar-2-5] DEBUG org.apache.pulsar.broker.loadbalance.impl.LoadManagerShared - Added Shared Broker - [10.172.167.155] as possible Candidates for namespace - [new_tenant_1/new_namespace_1] 2023-05-24T07:20:12,174+0800 [pulsar-2-5] DEBUG org.apache.pulsar.broker.loadbalance.impl.LoadManagerShared - Policies not present for namespace - [new_tenant_1/new_namespace_1] so only considering shared [3] brokers for possible owner 2023-05-24T07:20:12,174+0800 [pulsar-2-5] INFO org.apache.pulsar.broker.loadbalance.impl.ModularLoadManagerImpl - 3 brokers being considered for assignment of new_tenant_1/new_namespace_1/0xc0000000_0xffffffff 2023-05-24T07:20:12,175+0800 [pulsar-2-5] DEBUG org.apache.pulsar.broker.loadbalance.impl.LeastLongTermMessageRate - Broker http://10.172.167.154:8080 has long term message rate 200.0 2023-05-24T07:20:12,175+0800 [pulsar-2-5] DEBUG org.apache.pulsar.broker.loadbalance.impl.LeastLongTermMessageRate - Broker http://10.172.167.153:8080 has long term message rate 200.00051005213578 2023-05-24T07:20:12,175+0800 [pulsar-2-5] DEBUG org.apache.pulsar.broker.loadbalance.impl.LeastLongTermMessageRate - Broker http://10.172.167.155:8080 has long term message rate 200.0 2023-05-24T07:20:12,175+0800 [pulsar-2-5] DEBUG org.apache.pulsar.broker.loadbalance.impl.ModularLoadManagerImpl - Selected broker Optional[10.172.167.154:8080] from candidate brokers [10.172.167.154:8080, 10.172.167.153:8080, 10.172.167.155:8080] 2023-05-24T07:20:12,175+0800 [pulsar-2-5] DEBUG org.apache.pulsar.broker.namespace.NamespaceService - http://10.172.167.155:8080 : redirecting to the least loaded broker, lookup address=http://10.172.167.154:8080 2023-05-24T07:20:12,175+0800 [pulsar-2-5] DEBUG org.apache.pulsar.broker.namespace.NamespaceService - Redirecting to broker http://10.172.167.154:8080 to acquire ownership of bundle new_tenant_1/new_namespace_1/0xc0000000_0xffffffff ``` **another case** is proxy sending delete request to broker 155, broker 155 find 154 is active because broker thinks broker 154 is in availableBrokers(see it at function **isBrokerActive**), the cache availableBrokers is not update, and then it will redict lookup request to broker 154, but the fact broker 154 is broken. here is the log: ``` 2023-05-25T13:20:20,131+0800 [pulsar-web-35-16] DEBUG org.apache.pulsar.common.naming.NamespaceBundleFactory - Loading cache with bundles for new_tenant_1/new_namespace_1 2023-05-25T13:20:20,135+0800 [main-EventThread] DEBUG org.apache.pulsar.common.naming.NamespaceBundleFactory - [new_tenant_1/new_namespace_1] Get bundles from getLocalZkCacheService: bundles: BundlesDataImpl(boundaries=[0x00000000, 0x40000000, 0x80000000, 0xc0000000, 0xffffffff], numBundles=4), version: Optional[0] 2023-05-25T13:20:20,135+0800 [main-EventThread] DEBUG org.apache.pulsar.broker.namespace.NamespaceService - findBrokerServiceUrl: new_tenant_1/new_namespace_1/0xc0000000_0xffffffff - options: LookupOptions(authoritative=false, readOnly=false, loadTopicsInBundle=false, requestHttps=false, advertisedListenerName=null) 2023-05-25T13:20:20,140+0800 [pulsar-2-8] DEBUG org.apache.pulsar.broker.namespace.NamespaceService - Broker http://10.172.167.154:8080 (10.172.167.154:8080) **is available for**. 5-25T13:20:20,140+0800 [pulsar-2-8] DEBUG org.apache.pulsar.broker.namespace.NamespaceService - Redirecting to broker http://10.172.167.154:8080 to acquire ownership of bundle new_tenant_1/new_namespace_1/0xc0000000_0xffffffff 5-25T13:20:20,140+0800 [pulsar-2-8] pache.pulsar.broker.web.PulsarWebResource - Redirecting the rest call to http://10.172.167.154:8080/admin/v2/persistent/new_tenant_1/new_namespace_1/new_topic_1/partitions?authoritative=false ``` ``` private boolean isBrokerActive(String candidateBroker) { String candidateBrokerHostAndPort = parseHostAndPort(candidateBroker); Set<String> availableBrokers = getAvailableBrokers(); if (availableBrokers.contains(candidateBrokerHostAndPort)) { if (LOG.isDebugEnabled()) { LOG.debug("Broker {} ({}) is available for.", candidateBroker, candidateBrokerHostAndPort); } return true; } else { LOG.warn("Broker {} ({}) couldn't be found in available brokers {}", candidateBroker, candidateBrokerHostAndPort, availableBrokers.stream().collect(Collectors.joining(","))); return false; } } ``` **one more** the cache /loadbalance/leader sometime will not update when leader broker broken down, as you can see following ``` 2023-05-24T08:20:09,675+0800 [pulsar-web-35-1] DEBUG org.apache.pulsar.broker.namespace.NamespaceService - findBrokerServiceUrl: new_tenant_1/new_namespace_1/0x40000000_0x80000000 - options: LookupOptions(authoritative=false, readOnly=false, loadTopicsInBundle=false, requestHttps=false, advertisedListenerName=null) 2023-05-24T08:20:09,675+0800 [pulsar-2-5] WARN org.apache.pulsar.broker.namespace.NamespaceService - Broker http://10.172.167.154:8080 (10.172.167.154:8080) couldn't be found in available brokers 10.172.167.153:8080,10.172.167.155:8080 2023-05-24T08:20:09,675+0800 [pulsar-2-5] WARN org.apache.pulsar.broker.namespace.NamespaceService - The current leader broker LeaderBroker(serviceUrl=http://10.172.167.154:8080) isn't active. Handling load manager decisions in a decentralized way. NamespaceBundle[new_tenant_1/new_namespace_1/0x40000000_0x80000000] ``` I think the reason is pulsar broker miss the watch event, when zk client lost connection and broker broken down before connection recovered, the function addWatch could not trigger watch event. ### Minimal reproduce step To valid the hypothesis,I reproduced the problem by following the steps below 1. first we create a zk client and register listenser as following code, we only connect to one IP in the zk cluster. ``` MetadataStoreExtended client = MetadataStoreExtended.create("10.155.192.47:2181", MetadataStoreConfig.builder().build()); client.registerListener(notification -> { if ("/test".equals(notification.getPath()) && NotificationType.ChildrenChanged.equals(notification.getType())){ client.getChildren("/test").whenComplete((nodes, ex) ->{ System.out.println("node ChildrenChanged: " + nodes); }); } }); ``` 2. Disconnect the zk server the client connect to 3. create node in another zk node 4. recover the zk server the client connect to ### What did you expect to see? We should be able to see that zk client can receive the watch events during the disconnection time, but we can not get change. I found discussion in https://github.com/apache/zookeeper/pull/1106 ### What did you see instead? not any event recieved after the connection recover ``` 2023-05-26 11:51:27,661 - WARN - [metadata-store-zk-session-watcher-2-1:ZKSessionWatcher@158] - ZooKeeper client is disconnected. Waiting to reconnect, time remaining = 9.68 seconds sessionID = 305aaad27c20001 2023-05-26 11:51:29,677 - WARN - [metadata-store-zk-session-watcher-2-1:ZKSessionWatcher@113] - timeout check zk state, use Disconnected, sessionId is 305aaad27c20001, zk state is CONNECTING 2023-05-26 11:51:29,677 - WARN - [metadata-store-zk-session-watcher-2-1:ZKSessionWatcher@158] - ZooKeeper client is disconnected. Waiting to reconnect, time remaining = 7.664 seconds sessionID = 305aaad27c20001 2023-05-26 11:51:31,691 - WARN - [metadata-store-zk-session-watcher-2-1:ZKSessionWatcher@113] - timeout check zk state, use Disconnected, sessionId is 305aaad27c20001, zk state is CONNECTING 2023-05-26 11:51:31,691 - WARN - [metadata-store-zk-session-watcher-2-1:ZKSessionWatcher@158] - ZooKeeper client is disconnected. Waiting to reconnect, time remaining = 5.65 seconds sessionID = 305aaad27c20001 2023-05-26 11:51:33,399 - INFO - [main-SendThread(10.155.192.47:2181):ClientCnxn$SendThread@1171] - Opening socket connection to server 10.155.192.47/10.155.192.47:2181. 2023-05-26 11:51:33,399 - INFO - [main-SendThread(10.155.192.47:2181):ClientCnxn$SendThread@1173] - SASL config status: Will not attempt to authenticate using SASL (unknown error) 2023-05-26 11:51:33,428 - INFO - [main-SendThread(10.155.192.47:2181):ClientCnxn$SendThread@1005] - Socket connection established, initiating session, client: /10.78.75.128:55422, server: 10.155.192.47/10.155.192.47:2181 2023-05-26 11:51:33,464 - INFO - [main-SendThread(10.155.192.47:2181):ClientCnxn$SendThread@1446] - Session establishment complete on server 10.155.192.47/10.155.192.47:2181, session id = 0x305aaad27c20001, negotiated timeout = 30000 2023-05-26 11:51:33,465 - INFO - [main-EventThread:ZooKeeperWatcherBase@130] - ZooKeeper client is connected now. 2023-05-26 11:51:33,477 - INFO - [metadata-store-1-1:ZKSessionWatcher@126] - Got ZK session watch event: WatchedEvent state:SyncConnected type:None path:null, sessionId: 305aaad27c20001 2023-05-26 11:51:33,477 - INFO - [metadata-store-1-1:ZKSessionWatcher@170] - ZooKeeper client reconnection with server quorum. Current status: ConnectionLost, sessionId is 305aaad27c20001 ``` ### Anything else? see persistentRecursiveWatches at: https://github.com/apache/zookeeper/pull/1106 ### Are you willing to submit a PR? - [X] I'm willing to submit a PR! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@pulsar.apache.org.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org