NameNode.create failed 
-----------------------

                 Key: HADOOP-1938
                 URL: https://issues.apache.org/jira/browse/HADOOP-1938
             Project: Hadoop
          Issue Type: Bug
          Components: dfs
    Affects Versions: 0.13.1
            Reporter: Runping Qi



Under heavy load, the DFS namenode fails to create files:

org.apache.hadoop.ipc.RemoteException: java.io.IOException: Failed to create 
file /xxx/xxx/_task_0001_r_000001_0/part-00001 on client xxx.xxx.xxx.xxx 
because there were not enough datanodes available. Found 0 datanodes but 
MIN_REPLICATION for the cluster is configured to be 1.
        at org.apache.hadoop.dfs.FSNamesystem.startFile(FSNamesystem.java:651)
        at org.apache.hadoop.dfs.NameNode.create(NameNode.java:294)
        at sun.reflect.GeneratedMethodAccessor92.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:341)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:573)

The above problem occurred when I ran a well-tuned map/reduce program on a hood 
node cluster.
The program is well tuned in the sense that the map output data are evenly 
partitioned among 180 reducers.
Shuffling and sorting completed at about the same time on all the 
reducers.
The reducers started their reduce work at about the same time and were expected 
to produce about the same amount of output (2GB).
This "synchronized" behavior caused the reducers to try to create their output 
DFS files at about the same time.
The namenode seemed to have difficulty handling that situation, leaving the 
reducers waiting on file creation for a long period of time.
Eventually, they failed with the above exception.
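
For anyone trying to reproduce the pattern, the burst of simultaneous create 
calls can be sketched roughly as follows. This is only an illustration under 
assumptions: FileCreator is a hypothetical stand-in for whatever DFS client 
call creates the output file (it is not a Hadoop API), the paths are made up, 
and the latch merely imitates reducers that all reach the file-creation step 
at the same moment.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class SynchronizedCreateBurst {
    /** Hypothetical stand-in for a DFS file-creation call; not a Hadoop API. */
    interface FileCreator {
        void create(String path) throws Exception;
    }

    /** Releases all clients at once and returns how many create calls failed. */
    static int runBurst(int clients, FileCreator creator) {
        ExecutorService pool = Executors.newFixedThreadPool(clients);
        CountDownLatch start = new CountDownLatch(1);      // holds every client back
        CountDownLatch done = new CountDownLatch(clients); // counts finished clients
        AtomicInteger failures = new AtomicInteger();

        for (int i = 0; i < clients; i++) {
            // Made-up output path, one per simulated reducer.
            final String path = String.format("/tmp/burst/part-%05d", i);
            pool.execute(() -> {
                try {
                    start.await();        // wait until all clients are ready
                    creator.create(path); // every client calls create together
                } catch (Exception e) {
                    failures.incrementAndGet();
                } finally {
                    done.countDown();
                }
            });
        }

        start.countDown(); // release the whole burst at once
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for burst", e);
        } finally {
            pool.shutdown();
        }
        return failures.get();
    }

    public static void main(String[] args) {
        // 180 matches the number of reducers in the report; the creator is a no-op here.
        int failures = runBurst(180, path -> { });
        System.out.println("failed creates: " + failures);
    }
}
```

Plugging a real client call in place of the no-op creator would be needed to 
put actual load on a namenode; the sketch itself only shows the timing pattern.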


 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
