[ https://issues.apache.org/jira/browse/HDFS-599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753214#action_12753214 ]
Konstantin Boudnik commented on HDFS-599:
-----------------------------------------

bq. Why would that be? ...the default setting would still be to use one single port for DatanodeProtocol and ClientProtocol, so it should not affect (security) for preliminary users

Can the default configuration be changed by advanced users, opening the second port? Perhaps. And because the single port is considered the default configuration, I can see how only the default one gets tested and the non-default one won't.

bq. The extra port is visible only to machines only from all cluster nodes and not from outside. It is still behind any firewall that you might have

Will it always be like that? Would it be possible in the future to have some cluster nodes outside of one's firewall? As soon as that happens, a second port would have to be opened on the firewall, consequently widening the attack vector. Again, it isn't about what the current patch does or does not introduce. It is about an increase in the number of gates from which an attack can be mounted. That's why I've referred to the 'surface' - it provides more chances for something to be forgotten, insufficiently tested, etc.

> Improve Namenode robustness by prioritizing datanode heartbeats over client requests
> -------------------------------------------------------------------------------------
>
>                 Key: HDFS-599
>                 URL: https://issues.apache.org/jira/browse/HDFS-599
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>
> The namenode processes RPC requests from clients that are reading/writing to files as well as heartbeats/block reports from datanodes.
> Sometimes, for various reasons (Java GC runs, inconsistent performance of the NFS filer that stores HDFS transaction logs, etc.), the namenode encounters transient slowness. For example, if the device that stores the HDFS transaction logs becomes sluggish, the Namenode's ability to process RPCs slows down to a certain extent. During this time, the RPCs from clients as well as the RPCs from datanodes suffer in a similar fashion. If the underlying problem becomes worse, the NN's ability to process a heartbeat from a DN is severely impacted, causing the NN to declare that the DN is dead. Then the NN starts replicating blocks that used to reside on the now-declared-dead datanode. This adds extra load to the NN. Then the now-declared-dead datanode finally re-establishes contact with the NN and sends a block report. Block report processing on the NN is another heavyweight activity, thus causing more load to the already overloaded namenode.
> My proposal is that the NN should try its best to continue processing RPCs from datanodes and give lower priority to serving client requests. The Datanode RPCs are integral to the consistency and performance of the Hadoop file system, and it is better to protect them at all costs. This will ensure that the NN recovers from the hiccup much faster than it does now.
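For concreteness, here is a minimal sketch of the idea behind the proposal and the "second port" being debated above, assuming illustrative names rather than the actual NameNode classes: give DatanodeProtocol calls (heartbeats, block reports) their own listener/handler pool so that a burst of ClientProtocol RPCs cannot starve them.

{code:java}
// Illustrative sketch only -- not Hadoop source; class and method names are assumed.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PrioritizedRpcDispatcher {

  // Separate handler pools: datanode calls never queue behind client calls.
  private final ExecutorService datanodeHandlers = Executors.newFixedThreadPool(4);
  private final ExecutorService clientHandlers = Executors.newFixedThreadPool(16);

  /** Heartbeats and block reports compete only with other datanode calls. */
  public Future<?> submitDatanodeCall(Runnable call) {
    return datanodeHandlers.submit(call);
  }

  /** Client read/write RPCs go to their own pool and can back up
   *  without delaying heartbeat processing. */
  public Future<?> submitClientCall(Runnable call) {
    return clientHandlers.submit(call);
  }
}
{code}

In the patch under discussion, this separation takes the form of an optional second RPC server on a dedicated port for the datanode-facing protocol, which is what raises the firewall/attack-surface question above; when that optional address is not configured, everything stays on the single default port.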