[ https://issues.apache.org/jira/browse/HADOOP-1093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12486472 ]

dhruba borthakur commented on HADOOP-1093:
------------------------------------------

I am not too inclined to check whether the last block has achieved its minimum replication factor before allocating the next block. The datanodes send block confirmations directly to the namenode, and even a single slow datanode for one block of the file would adversely impact write performance. But I agree with you that we need to slow down the client if the namenode is overloaded. If the addBlock() RPC times out, maybe the client should retry?

> NNBench generates millions of NotReplicatedYetException in Namenode log
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-1093
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1093
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.12.0
>            Reporter: Nigel Daley
>         Assigned To: dhruba borthakur
>             Fix For: 0.13.0
>
>         Attachments: nyr2.patch
>
>
> Running NNBench on latest trunk (0.12.1 candidate) on a few hundred nodes
> yielded 2.3 million of these exceptions in the NN log:
>
> 2007-03-08 09:23:03,053 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 0 on 8020 call error:
> org.apache.hadoop.dfs.NotReplicatedYetException: Not replicated yet
>         at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:803)
>         at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:309)
>         at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:336)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:559)
>
> I run NNBench to create files with block size set to 1 and replication set
> to 1. NNBench then writes 1 byte to the file. Minimum replication for the
> cluster is the default, i.e. 1.
> If it encounters an exception while trying to do either the create or write
> operations, it loops and tries again. Multiply this by 1000 files per node
> and a few hundred nodes.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
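The retry behavior suggested in the comment (back off and retry the addBlock() RPC instead of looping tightly) could be sketched roughly as below. This is a hypothetical illustration, not the actual DFSClient code or the fix in nyr2.patch: the class, method names, and parameters are invented for the example, and a generic Callable stands in for the real RPC.

```java
import java.util.concurrent.Callable;

public class BackoffRetry {
    // Retries the call up to maxRetries times, doubling the sleep between
    // attempts so an overloaded namenode is not hammered in a tight loop.
    static <T> T callWithBackoff(Callable<T> call, int maxRetries,
                                 long initialSleepMs) throws Exception {
        long sleep = initialSleepMs;
        for (int attempt = 0; ; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {       // e.g. NotReplicatedYetException
                if (attempt >= maxRetries) {
                    throw e;              // give up after maxRetries attempts
                }
                Thread.sleep(sleep);
                sleep *= 2;               // exponential backoff
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Simulate a namenode that rejects the first two addBlock() calls.
        final int[] calls = {0};
        String result = callWithBackoff(() -> {
            if (++calls[0] < 3) {
                throw new RuntimeException("Not replicated yet");
            }
            return "block-allocated";
        }, 5, 1);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

With backoff, the client still retries until the block is replicated, but each failed attempt costs the namenode one RPC instead of millions, addressing the log flood NNBench exposed.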