[
https://issues.apache.org/jira/browse/HIVE-24713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Eugene Chung updated HIVE-24713:
--------------------------------
Description:
When ZooKeeper discovery mode is used, an HS2 instance may never learn that it
has been deregistered from ZooKeeper.
The problem is easy to reproduce.
# Find the ZK server which holds the DeRegisterWatcher watches of the HS2
instances. If the ZK server version is 3.5.0 or above, it can be found easily
with [http://zk-server:8080/commands/watches] (ZK AdminServer feature)
# Check which HS2 instance is watching on the ZK server found in step 1; say it is
_hs2-of-2_
# Restart the ZK server found in step 1
# Deregister _hs2-of-2_ with the command
{noformat}
hive --service hiveserver2 -deregister hs2-of-2{noformat}
# _hs2-of-2_ never learns that it must shut down, because the watch event of
DeRegisterWatcher had already fired during step 3.
The cause of the problem is explained at
[https://zookeeper.apache.org/doc/r3.3.3/zookeeperProgrammers.html#sc_WatchRememberThese]
I added some logging to DeRegisterWatcher and checked which events occurred
during step 3 (the restart of the ZK server):
# WatchedEvent state:Disconnected type:None path:null
# WatchedEvent[WatchedEvent state:SyncConnected type:None path:null]
# WatchedEvent[WatchedEvent state:SaslAuthenticated type:None path:null]
# WatchedEvent[WatchedEvent state:SyncConnected type:NodeDataChanged
path:/hiveserver2/serverUri=hs2-of-2:10000;version=3.1.2;sequence=0000000711]
As the ZK manual says, watches are one-time triggers. When the connection to
the ZK server was reestablished, the event state:SyncConnected
type:NodeDataChanged for the hs2-of-2 path fired, and that consumed the watch.
*DeRegisterWatcher must be registered again for the same znode to get a future
NodeDeleted event.*
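
The one-time-trigger behavior can be illustrated with a small in-memory sketch. This is a toy model only: ToyServer, Watcher, EventType and run are hypothetical names made up for this illustration, not the ZooKeeper client API and not the HiveServer2 code, and the znode path is shortened. It replays the sequence described above (a NodeDataChanged event followed later by a NodeDeleted event) and shows that the second event reaches the watcher only if the watcher re-registers itself after the first.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of one-time watches. NOT the ZooKeeper API; all names are hypothetical. */
final class OneShotWatchSketch {

    enum EventType { NodeDataChanged, NodeDeleted }

    interface Watcher {
        void process(EventType type, String path);
    }

    /** Toy server: every registered watch is delivered at most once. */
    static final class ToyServer {
        private final Map<String, List<Watcher>> watches = new HashMap<>();

        void watch(String path, Watcher watcher) {
            watches.computeIfAbsent(path, k -> new ArrayList<>()).add(watcher);
        }

        void fire(EventType type, String path) {
            // Remove the watches before delivery, so this event consumes them.
            List<Watcher> fired = watches.remove(path);
            if (fired == null) {
                return; // nobody is watching any more
            }
            for (Watcher watcher : fired) {
                watcher.process(type, path);
            }
        }
    }

    /** Replays the reported sequence and returns the events the watcher saw. */
    static List<String> run(boolean reRegister) {
        ToyServer server = new ToyServer();
        String path = "/hiveserver2/hs2-of-2"; // shortened, illustrative path
        List<String> seen = new ArrayList<>();

        Watcher watcher = new Watcher() {
            @Override
            public void process(EventType type, String p) {
                seen.add(type.name());
                // The proposed fix: watch again unless the znode is already gone.
                if (reRegister && type != EventType.NodeDeleted) {
                    server.watch(p, this);
                }
            }
        };

        server.watch(path, watcher);
        server.fire(EventType.NodeDataChanged, path); // seen after the reconnect
        server.fire(EventType.NodeDeleted, path);     // the later -deregister
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("without re-registration: " + run(false));
        System.out.println("with re-registration:    " + run(true));
    }
}
```

Without re-registration the watcher sees only NodeDataChanged. With re-registration it also sees NodeDeleted, which is the event HS2 needs in order to shut down.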
was:
While using zookeeper discovery mode, the problem that HS2 never knows
deregistering from Zookeeper could always happen.
Reproduction is simple.
# Find one of the zk servers which holds the DeRegisterWatcher watches of HS2.
If ZK server is 3.5.0 or above, it's easily found with
[http://zk-server:8080/commands/watches] (ZK AdminServer feature)
# Check which HS2 is watching on the ZK server found at 1, say it's _hs2-of-2_
# Restart the ZK server found at 1
# Deregister HS2 with the command
{noformat}
hive --service hiveserver2 -deregister hs2-of-2{noformat}
# _hs2-of-2_ never knows that it must be shut down because the watch event of
DeregisterWatcher was already fired at the time of 3.
The reason of the problem is explained at
[https://zookeeper.apache.org/doc/r3.3.3/zookeeperProgrammers.html#sc_WatchRememberThese]
I added logging to DeRegisterWatcher and checked what events were occurred at
the time of 3(restarting of ZK server);
# WatchedEvent state:Disconnected type:None path:null
# WatchedEvent[WatchedEvent state:SyncConnected type:None path:null]
# WatchedEvent[WatchedEvent state:SaslAuthenticated type:None path:null]
# WatchedEvent[WatchedEvent state:SyncConnected type:NodeDataChanged
path:/hiveserver2/serverUri=hs2-of-2:10000;version=3.1.2;sequence=0000000711]
As the zk manual says, watchs are one-time triggers. When connection to the ZK
server was reestablished, state:SyncConnected
type:NodeDataChanged,path:hs2-of-2 was fired and that's all. *DeregisterWatcher
must be registered again for the same znode to get a future NodeDeleted event.*
> HS2 never knows the deletion of znode in the particular case
> ------------------------------------------------------------
>
> Key: HIVE-24713
> URL: https://issues.apache.org/jira/browse/HIVE-24713
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2
> Reporter: Eugene Chung
> Assignee: Eugene Chung
> Priority: Major
> Fix For: 4.0.0
>