zrlw edited a comment on issue #8993: URL: https://github.com/apache/dubbo/issues/8993#issuecomment-939205361
The log of SingleRegistryCenterDubboProtocolIntegrationTest, which failed with "zookeeper not connected", contains a zk client session timeout warning that belongs to the previous test class, SingleRegistryCenterInjvmIntegrationTest (build log: https://github.com/apache/dubbo/runs/3835835450?check_suite_focus=true). The main excerpts are:

```
[INFO] Running org.apache.dubbo.integration.single.injvm.SingleRegistryCenterInjvmIntegrationTest   <== the previous test class, SingleRegistryCenterInjvmIntegrationTest
[08/10/21 07:35:39:199 UTC] Curator-ConnectionStateManager-0 INFO curator.CuratorZookeeperClient: [DUBBO] Curator zookeeper client instance initiated successfully, session id is 100001a38dd0000, dubbo version: 3.0.4-SNAPSHOT, current host: 172.19.112.1   <== zk client session of the previous test class (id: 100001a38dd0000)
[08/10/21 07:35:39:624 UTC] main INFO support.RegistryManager: [DUBBO] Close all registries [], dubbo version: 3.0.4-SNAPSHOT, current host: 172.19.112.1   <== the previous test class closes all registries
[08/10/21 07:35:39:624 UTC] main INFO deploy.DefaultApplicationDeployer: [DUBBO] Dubbo Application[243.1] has stopped., dubbo version: 3.0.4-SNAPSHOT, current host: 172.19.112.1   <== the previous test class's Dubbo application has stopped
[08/10/21 07:35:39:632 UTC] main INFO registrycenter.ZookeeperRegistryCenter: [DUBBO] The ZookeeperRegistryCenter close successfully., dubbo version: 3.0.4-SNAPSHOT, current host: 172.19.112.1   <== the previous test class closes the zk registry center
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.493 s - in org.apache.dubbo.integration.single.injvm.SingleRegistryCenterInjvmIntegrationTest   <== the previous test class ends
[INFO] Running org.apache.dubbo.integration.single.SingleRegistryCenterDubboProtocolIntegrationTest   <== start of the test class that fails to connect to zk, SingleRegistryCenterDubboProtocolIntegrationTest
[08/10/21 07:35:39:634 UTC] main INFO registrycenter.ZookeeperRegistryCenter: [DUBBO] The ZookeeperRegistryCenter is starting..., dubbo version: 3.0.4-SNAPSHOT, current host: 172.19.112.1   <== starts the zk registry center
[08/10/21 07:35:39:727 UTC] Curator-ConnectionStateManager-0 WARN curator.CuratorZookeeperClient: [DUBBO] Curator zookeeper connection of session 100001a38dd0000 timed out. connection timeout value is 3000, session expire timeout value is 60000, dubbo version: 3.0.4-SNAPSHOT, current host: 172.19.112.1   <== reports that the previous test class's session (id: 100001a38dd0000) timed out
[08/10/21 07:35:40:655 UTC] main INFO registrycenter.ZookeeperRegistryCenter: [DUBBO] The ZookeeperRegistryCenter is started successfully, dubbo version: 3.0.4-SNAPSHOT, current host: 172.19.112.1
```

Problem: the tearDown of the previous test class, SingleRegistryCenterInjvmIntegrationTest, calls DubboBootstrap.reset(), so all zk clients should have been closed. However, stepping through with a debugger shows that the doClose method of CuratorZookeeperClient is never called.

I read a post about troubleshooting a Curator connection problem. It says that Curator's event loop is an endless loop that invokes each watcher in turn, and if one watcher hangs, none of the events behind it are processed. In other words, if the zk connection event is queued behind the hung handler, Curator never gets the chance to process the connected event and set currentConnectionState to connected, so the application's connection attempt times out and fails.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
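The blocking behaviour described in that post can be illustrated with a minimal Java sketch. This is not Curator code: a single-thread executor stands in for the event loop, and the class name `BlockedEventLoopSketch`, the method `connectedInTime`, and the timeout values are hypothetical names chosen for the illustration. It only shows why an event queued behind a hung handler is never processed before the caller's timeout expires.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class BlockedEventLoopSketch {

    /**
     * Simulates a single-threaded event loop that handles two events in order:
     * first an arbitrary watcher callback, then the "connected" event.
     * Returns true if the "connected" event was handled within timeoutMillis.
     */
    static boolean connectedInTime(boolean firstWatcherHangs, long timeoutMillis) {
        // One thread handles all events sequentially (stand-in for the event loop).
        ExecutorService eventLoop = Executors.newSingleThreadExecutor();
        CountDownLatch neverReleased = new CountDownLatch(1);
        CountDownLatch connected = new CountDownLatch(1);
        try {
            // Event 1: a watcher that may block forever.
            eventLoop.execute(() -> {
                if (firstWatcherHangs) {
                    try {
                        neverReleased.await(); // hangs until interrupted
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }
            });
            // Event 2: the "connected" event, queued behind event 1.
            eventLoop.execute(connected::countDown);

            // The caller waits for the connection state to change, with a timeout.
            return connected.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            // Interrupts the hung watcher so the JVM can exit.
            eventLoop.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("healthy watcher: connected=" + connectedInTime(false, 1000));
        System.out.println("hung watcher:    connected=" + connectedInTime(true, 300));
    }
}
```

With a healthy first watcher the "connected" event is handled and the wait succeeds. With a hung first watcher the event stays in the queue and the wait returns false after the timeout, which is the same shape as the connection timeout in the log above.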
