Eugene Chung created HIVE-24713:
-----------------------------------

             Summary: HS2 never knows the deletion of znode on some cases
                 Key: HIVE-24713
                 URL: https://issues.apache.org/jira/browse/HIVE-24713
             Project: Hive
          Issue Type: Bug
          Components: HiveServer2
            Reporter: Eugene Chung
            Assignee: Eugene Chung
             Fix For: 4.0.0

While using ZooKeeper discovery mode, HS2 can fail to notice that it has been deregistered from ZooKeeper. The problem is easy to reproduce:

# Find one of the ZK servers that holds the DeRegisterWatcher watches of HS2. If the ZK server is 3.5.0 or above, it is easily found with [http://zk-server:8080/commands/watches] (a ZK AdminServer feature).
# Check which HS2 is watching on the ZK server found in step 1; say it's _hs2-of-2_.
# Restart the ZK server found in step 1.
# Deregister HS2 with the command
{noformat}
hive --service hiveserver2 -deregister hs2-of-2{noformat}
# _hs2-of-2_ never learns that it must shut down, because the watch event of DeRegisterWatcher was already fired at the time of step 3.

The cause of the problem is explained at [https://zookeeper.apache.org/doc/r3.3.3/zookeeperProgrammers.html#sc_WatchRememberThese]

I added logging to DeRegisterWatcher and checked which events occurred at the time of step 3 (the restart of the ZK server):

# WatchedEvent state:Disconnected type:None path:null
# WatchedEvent[WatchedEvent state:SyncConnected type:None path:null]
# WatchedEvent[WatchedEvent state:SaslAuthenticated type:None path:null]
# WatchedEvent[WatchedEvent state:SyncConnected type:NodeDataChanged path:/hiveserver2/serverUri=hs2-of-2:10000;version=3.1.2;sequence=0000000711]

As the ZK manual says, watches are one-time triggers. When the connection to the ZK server was reestablished, the event state:SyncConnected type:NodeDataChanged was fired for the znode of hs2-of-2, and that was all. Once that event had consumed the watch, nothing was left to report the later deletion. *DeRegisterWatcher must be registered again for the same znode to get a future NodeDeleted event.*

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
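
To illustrate the one-time-trigger behaviour described in the report, here is a minimal, self-contained simulation. It is only a sketch: it uses neither the ZooKeeper client API nor Hive's code, and it is not the fix applied in Hive. The names FakeZnode, DeregisterWatcherSketch and OneShotWatchDemo are hypothetical and exist only for this example. It replays the sequence from the report (a NodeDataChanged event after the ZK restart, then the deletion) and compares a watcher that re-registers itself with one that does not.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical in-memory stand-in for a znode with one-time watches. */
class FakeZnode {
    enum EventType { NODE_DATA_CHANGED, NODE_DELETED }

    private List<Consumer<EventType>> watches = new ArrayList<>();

    /** Registers a watch that is delivered for the next event only. */
    void watch(Consumer<EventType> watcher) {
        watches.add(watcher);
    }

    /** Delivers the event to every pending watch, then forgets those watches. */
    void fire(EventType type) {
        List<Consumer<EventType>> pending = watches;
        watches = new ArrayList<>(); // watches are consumed before callbacks run
        for (Consumer<EventType> w : pending) {
            w.accept(type);
        }
    }
}

/** Hypothetical watcher modelled on the behaviour reported for DeRegisterWatcher. */
class DeregisterWatcherSketch implements Consumer<FakeZnode.EventType> {
    private final FakeZnode znode;
    private final boolean reRegister;
    boolean shutdownRequested = false;

    DeregisterWatcherSketch(FakeZnode znode, boolean reRegister) {
        this.znode = znode;
        this.reRegister = reRegister;
    }

    @Override
    public void accept(FakeZnode.EventType type) {
        if (type == FakeZnode.EventType.NODE_DELETED) {
            shutdownRequested = true; // the znode is gone, so the server should stop
        } else if (reRegister) {
            znode.watch(this);        // re-arm the watch so the next event is seen
        }
    }
}

class OneShotWatchDemo {
    /** Replays the reported sequence and tells whether the deletion was observed. */
    static boolean sawDeletion(boolean reRegister) {
        FakeZnode znode = new FakeZnode();
        DeregisterWatcherSketch watcher = new DeregisterWatcherSketch(znode, reRegister);
        znode.watch(watcher);
        znode.fire(FakeZnode.EventType.NODE_DATA_CHANGED); // event seen after the ZK restart
        znode.fire(FakeZnode.EventType.NODE_DELETED);      // effect of the deregister command
        return watcher.shutdownRequested;
    }
}
```

In this sketch, OneShotWatchDemo.sawDeletion(false) returns false because the first event consumes the only watch, while OneShotWatchDemo.sawDeletion(true) returns true because the watcher re-arms itself on every event other than the deletion.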