[ https://issues.apache.org/jira/browse/HDFS-599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12868299#action_12868299 ]
dhruba borthakur commented on HDFS-599:
---------------------------------------

> You should break the current ClientProtocol into AdminProtocol and the real
> ClientProtocol

We can certainly do this. The ClientProtocol essentially consists of methods that fall into two categories:

1. Calls that modify file system metadata in the namenode
2. Calls that retrieve portions of file system metadata from the namenode

Let's consider the case when the NN is restarting and is in safemode. Only the servicePort is open at this time. The calls in category 1 will fail anyway because the namenode is in safemode. That leaves us with only the calls in category 2. When the namenode is in safemode, the admin would still want to be able to list files (dfs -lsr), get status on files (dfs -ls), see the amount of space used by a portion of the namespace (dfs -count), validate the block size of existing files, and look at the target of a symlink. That means the admin would want to invoke most of the calls in category 2, right?

If you agree with the above, then it is not very beneficial to break up ClientProtocol into two parts, because both parts would have to be available on the service port.

There are quite a few things that we can do to handle "a mis-configured client happens to choose the service port as its client port". If we do not even list the service port in the client's config, that would be a good thing. Can we start there?

> Improve Namenode robustness by prioritizing datanode heartbeats over client requests
> ------------------------------------------------------------------------------------
>
>                 Key: HDFS-599
>                 URL: https://issues.apache.org/jira/browse/HDFS-599
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>            Reporter: dhruba borthakur
>            Assignee: Dmytro Molkov
>         Attachments: HDFS-599.patch
>
>
> The namenode processes RPC requests from clients that are reading/writing to
> files as well as heartbeats/block reports from datanodes.
> Sometimes, for various reasons (Java GC runs, inconsistent performance of
> the NFS filer that stores HDFS transaction logs, etc.), the namenode
> encounters transient slowness. For example, if the device that stores the
> HDFS transaction logs becomes sluggish, the Namenode's ability to process
> RPCs slows down to a certain extent. During this time, the RPCs from clients
> as well as the RPCs from datanodes suffer in similar fashion. If the
> underlying problem becomes worse, the NN's ability to process a heartbeat
> from a DN is severely impacted, thus causing the NN to declare that the DN is
> dead. Then the NN starts replicating blocks that used to reside on the
> now-declared-dead datanode. This adds extra load to the NN. Then the
> now-declared-dead datanode finally re-establishes contact with the NN, and
> sends a block report. The block report processing on the NN is another
> heavyweight activity, thus causing more load on the already overloaded
> namenode.
> My proposal is that the NN should try its best to continue processing RPCs
> from datanodes and give lower priority to serving client requests. The
> Datanode RPCs are integral to the consistency and performance of the Hadoop
> file system, and it is better to protect them at all costs. This will ensure
> that the NN recovers from the hiccup much faster than it does now.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
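
The proposal in the issue description (keep processing datanode RPCs and give lower priority to client requests) can be illustrated with a small two-queue dispatcher. This is only a sketch under stated assumptions, not the code in HDFS-599.patch: the class PriorityRpcSketch, the Call type, and every method on them are hypothetical, and the RPC names are used purely as labels. The thread itself discusses a separate service port; the sketch shows only the ordering idea, with strict priority assumed for simplicity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical illustration: datanode calls are always dequeued before client calls. */
public class PriorityRpcSketch {

    /** Hypothetical stand-in for a queued RPC call. */
    public static final class Call {
        final String source; // "datanode" or "client"
        final String name;   // label only, not a real invocation

        Call(String source, String name) {
            this.source = source;
            this.name = name;
        }

        @Override
        public String toString() {
            return source + ":" + name;
        }
    }

    // One queue per traffic class, so a backlog of client calls
    // cannot sit in front of a heartbeat or block report.
    private final BlockingQueue<Call> datanodeCalls = new LinkedBlockingQueue<>();
    private final BlockingQueue<Call> clientCalls = new LinkedBlockingQueue<>();

    public void submitFromDatanode(String name) {
        datanodeCalls.add(new Call("datanode", name));
    }

    public void submitFromClient(String name) {
        clientCalls.add(new Call("client", name));
    }

    /** Returns the next call to serve, preferring datanode calls; null if both queues are empty. */
    public Call nextCall() {
        Call c = datanodeCalls.poll();
        return (c != null) ? c : clientCalls.poll();
    }

    /** Drains both queues in service order and returns the labels of the calls served. */
    public List<String> drain() {
        List<String> order = new ArrayList<>();
        for (Call c = nextCall(); c != null; c = nextCall()) {
            order.add(c.toString());
        }
        return order;
    }

    public static void main(String[] args) {
        PriorityRpcSketch nn = new PriorityRpcSketch();
        nn.submitFromClient("getListing");
        nn.submitFromDatanode("sendHeartbeat");
        nn.submitFromClient("getFileInfo");
        nn.submitFromDatanode("blockReport");
        // prints [datanode:sendHeartbeat, datanode:blockReport, client:getListing, client:getFileInfo]
        System.out.println(nn.drain());
    }
}
```

Strict priority like this can starve client calls while datanode traffic is heavy. A real server would more likely bound that, for example by giving each traffic class its own port and handler threads, which is closer to the service port discussed in the comment.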