[ 
https://issues.apache.org/jira/browse/HADOOP-3810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699316#action_12699316
 ] 

Hairong Kuang commented on HADOOP-3810:
---------------------------------------

I tested the patch on a cluster of 61 nodes. One node ran the NameNode, and each 
of the other 60 nodes ran a datanode cluster of 50 simulated datanodes. The 
steps were:
1.      Apply the patch on HADOOP-5556 to the trunk.
2.      Install hadoop.
3.      Configure hadoop: set dfs.block.size to 10 and 
dfs.datanode.simulateddatastorage.capacity to 10. 
4.      Create an edit log in the pre-configured name directory. The edit log 
contains 3000 files, each of which has one block with a replication factor of 3:
bin/hadoop org.apache.hadoop.hdfs.server.namenode.CreateEditsLog -f 3000 0 1 -r 3 -d dfs_name_dir
5.      Start the NameNode using the created edit log.
6.      Start DataNodeCluster on each of the 60 nodes with the following 
parameters: -n 50 -simulated -r 1 -inject $startBlock 1. $startBlock should be 
0, 50, 100, ... on successive nodes.
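
The per-node $startBlock values in step 6 follow a simple pattern, sketched 
below in Python. This is only an illustration and not part of the test setup: 
the helper names (start_block, datanode_cluster_args) are hypothetical, and the 
sketch assumes that the last -inject argument is the number of blocks injected 
per simulated datanode, so each node covers 50 consecutive block IDs.

```python
# Sketch only: derives the DataNodeCluster arguments for each of the 60 nodes.
# Assumption: "-inject <startBlock> 1" injects one block per simulated
# datanode, so a node with 50 simulated datanodes covers 50 block IDs.

NUM_NODES = 60            # physical nodes running DataNodeCluster
DATANODES_PER_NODE = 50   # value passed to -n
BLOCKS_PER_DATANODE = 1   # last argument of -inject


def start_block(node_index):
    """First block ID for the node with the given 0-based index."""
    return node_index * DATANODES_PER_NODE * BLOCKS_PER_DATANODE


def datanode_cluster_args(node_index):
    """Argument string for DataNodeCluster on the given node."""
    return "-n {} -simulated -r 1 -inject {} {}".format(
        DATANODES_PER_NODE, start_block(node_index), BLOCKS_PER_DATANODE
    )


if __name__ == "__main__":
    # Show the arguments for the first three nodes.
    for i in range(3):
        print("node", i, ":", datanode_cluster_args(i))
```

Under that assumption the 60 nodes together inject 60 * 50 = 3000 block 
replicas, one for each block in the edit log.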

Now the cluster is up. The cluster is full but all 3000 blocks are 
under-replicated. With the patch, you can type any dfs command and get the 
response back immediately. Without the patch, shell commands hang.

> NameNode seems unstable on a cluster with little space left
> -----------------------------------------------------------
>
>                 Key: HADOOP-3810
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3810
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.17.1
>            Reporter: Raghu Angadi
>            Assignee: Hairong Kuang
>             Fix For: 0.20.0
>
>         Attachments: globalLock.patch, globalLock1.patch, simon-namenode.PNG
>
>
> NameNode seems unresponsive and unstable when the cluster has very 
> little space left. The clients time out. The main problem is that it is not 
> clear to the user what is going on. Once I have more details about a NameNode 
> that was in this state, I will fill in here.
> If there is not enough space left on a cluster, it is ok for clients to 
> receive something like "DiskOutOfSpace" exception. 
> Right now it looks like NameNode tries too hard to find a node with any space 
> left and ends up being slow to respond to clients. If the CPU taken by 
> chooseTarget() is the main cause, there are two possible fixes:
> # chooseTarget() iterates and takes quite a bit of CPU for allocating 
> datanodes. Usually this is not much of a problem. It takes even more CPU when 
> it needs to search multiple racks for a datanode. We could probably reduce 
> some CPU for these searches. The benefit should be measurable.
> # Once NameNode cannot find any datanode that has space on a rack, it could 
> mark the rack as "full" and skip searching the rack for the next minute or 
> so. This flag gets cleared after a minute or if any new node is added to the 
> rack.
> #* Of course, this might not be optimal w.r.t. disk space usage, but only for 
> a short duration. Once a cluster is mostly full, the user does expect errors.
> #* On the flip side, this fix does not require an extremely CPU-optimized 
> version of chooseTarget(). 
> #* I think it is reasonable for NameNode to throw DiskOutOfSpace exception, 
> even though it could have found space if it searched much more extensively.
> ---
> edit : minor changes
>  
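
The second fix proposed in the issue description above (mark a rack as "full" 
and skip it for about a minute, clearing the flag early when a node joins the 
rack) could be sketched as follows. This is a minimal Python illustration of 
the idea only; it is not the code in the attached patches and not NameNode 
code, and the class and method names (FullRackCache, mark_full, node_added, 
should_skip) are hypothetical.

```python
import time


class FullRackCache:
    """Remembers racks found to be full and skips them for a short time.

    Hypothetical sketch of the "mark the rack as full" idea; not Hadoop code.
    """

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the expiry can be tested
        self._full_until = {}        # rack name -> time at which the flag expires

    def mark_full(self, rack):
        """Record that no datanode with free space was found on this rack."""
        self._full_until[rack] = self._clock() + self._ttl

    def node_added(self, rack):
        """A new node joined the rack, so clear the flag immediately."""
        self._full_until.pop(rack, None)

    def should_skip(self, rack):
        """True while the rack is still flagged as full."""
        expiry = self._full_until.get(rack)
        if expiry is None:
            return False
        if self._clock() >= expiry:
            # The flag has aged out; the rack gets searched again.
            del self._full_until[rack]
            return False
        return True
```

A target-selection loop would call should_skip(rack) before searching a rack 
and mark_full(rack) when the search finds no datanode with space. The trade-off 
is the one noted in the description: space that frees up on a flagged rack goes 
unused until the flag expires.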

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
