Arun Suresh created HDFS-7858:
---------------------------------
Summary: Improve HA Namenode Failover detection on the client
using Zookeeper
Key: HDFS-7858
URL: https://issues.apache.org/jira/browse/HDFS-7858
Project: Hadoop HDFS
Issue Type: Improvement
Reporter: Arun Suresh
Assignee: Arun Suresh
In an HA deployment, clients are configured with the hostnames of both the
Active and Standby Namenodes. A client first tries one of the NNs
(non-deterministically); if it is the Standby NN, it responds by telling the
client to retry the request on the other Namenode.
If the client happens to talk to the Standby first, and the Standby is
in a GC pause or otherwise busy, the client might not get a response soon
enough to try the other NN.
Proposed approach:
1) Since Zookeeper is already used by the failover controller, clients
could ask ZK which Namenode is active before contacting it.
2) Long-lived DFSClients would set a ZK watch that fires when
there is a failover, so they do not have to query ZK every time to find out the
active NN.
3) Clients can also cache the last active NN in the user's home directory
(~/.lastNN) so that short-lived clients can try that Namenode first before
querying ZK.
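To make the lookup order in points 1) and 3) concrete, here is a minimal sketch, not a patch. It assumes the cache file holds a single host:port string and that the ZK read is hidden behind a plain Supplier, so no ZooKeeper API is shown. The class ActiveNnHint and the methods readCached, writeCached and tryOrder are hypothetical names and are not part of DFSClient. As a simplification, ZK is consulted only when the cache has no usable entry; the other configured NN always stays in the list as a fallback.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical helper: decides which configured NN a client should try first. */
final class ActiveNnHint {

    /** Returns the cached NN address, or empty if the file is missing, unreadable or blank. */
    static Optional<String> readCached(Path hintFile) {
        try {
            String s = new String(Files.readAllBytes(hintFile), StandardCharsets.UTF_8).trim();
            return s.isEmpty() ? Optional.<String>empty() : Optional.of(s);
        } catch (IOException e) {
            return Optional.empty(); // a missing cache is expected for first-time clients
        }
    }

    /** Best-effort write of the last known active NN; failures are ignored. */
    static void writeCached(Path hintFile, String nn) {
        try {
            Files.write(hintFile, nn.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            // The cache is only a hint, so losing it is harmless.
        }
    }

    /**
     * Orders the configured NNs: the cached hint first if it is valid, otherwise
     * the answer from zkLookup (assumed to wrap the ZK read), then the rest.
     */
    static List<String> tryOrder(List<String> configured, Path hintFile,
                                 Supplier<Optional<String>> zkLookup) {
        Optional<String> hint = readCached(hintFile).filter(configured::contains);
        if (!hint.isPresent()) {
            hint = zkLookup.get().filter(configured::contains);
            if (hint.isPresent()) {
                writeCached(hintFile, hint.get()); // remember for the next short-lived client
            }
        }
        List<String> order = new ArrayList<>();
        if (hint.isPresent()) {
            order.add(hint.get());
        }
        for (String nn : configured) {
            if (!order.contains(nn)) {
                order.add(nn); // keep the remaining NNs as fallback
            }
        }
        return order;
    }
}
```

A caller would pass the path to ~/.lastNN as hintFile and walk the returned list in order. A stale hint costs one extra round trip to the Standby, which is the same worst case as today.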