[ 
https://issues.apache.org/jira/browse/HDFS-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13595631#comment-13595631
 ] 

Amareshwari Sriramadasu commented on HDFS-4562:
-----------------------------------------------

Here is the log for /user/mapred/system/job_201303040902_85451/jobToken:

{noformat}
grep '/user/mapred/system/job_201303040902_85451/jobToken' namenode.log
2013-03-06 00:02:21,787 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.allocateBlock: /user/mapred/system/job_201303040902_85451/jobToken. blk_6341538266279390032_231558736
2013-03-06 00:02:21,795 INFO org.apache.hadoop.hdfs.StateChange: Removing lease on  file /user/mapred/system/job_201303040902_85451/jobToken from client DFSClient_201543923
2013-03-06 00:02:21,795 INFO org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.completeFile: file /user/mapred/system/job_201303040902_85451/jobToken is closed by DFSClient_201543923
{noformat}
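
To put a number on how short the write window for this file is, here is a minimal Python sketch that computes the time between the allocateBlock and completeFile entries. The two timestamps are copied from the log; the helper names (parse_ts, elapsed_ms) are made up for illustration, and the sketch assumes the "YYYY-MM-DD HH:MM:SS,mmm" timestamp layout seen in these lines.

```python
from datetime import datetime, timedelta

# Timestamps copied from the namenode.log lines for this file.
ALLOCATE_TS = "2013-03-06 00:02:21,787"  # NameSystem.allocateBlock
COMPLETE_TS = "2013-03-06 00:02:21,795"  # NameSystem.completeFile


def parse_ts(ts: str) -> datetime:
    """Parse a timestamp of the form 'YYYY-MM-DD HH:MM:SS,mmm'."""
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S,%f")


def elapsed_ms(start: str, end: str) -> int:
    """Whole milliseconds between two log timestamps (hypothetical helper)."""
    return (parse_ts(end) - parse_ts(start)) // timedelta(milliseconds=1)


if __name__ == "__main__":
    print(elapsed_ms(ALLOCATE_TS, COMPLETE_TS), "ms from allocateBlock to completeFile")
```

For this file the block is allocated and the file is closed within a few milliseconds.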

Here is the replication-related configuration:
{noformat}
      <name>dfs.replication.max</name>
      <value>50</value>

      <name>dfs.replication</name>
      <value>3</value>

      <name>mapred.submit.replication</name>
      <value>3</value>
{noformat}

bq. If I were to guess, one of the datanode in the pipeline reports a replica 
for a block late. The replication monitor is too aggressive and creates an 
additional replica meanwhile. However this should not happen for every block 
that is created.

I agree that this is not happening for every block, but the number of excess 
replicas we are seeing in our cluster is very large. We also have many 
short-lived files in our cluster, which result in two consecutive delete 
requests to a datanode, which then hits HDFS-4544. 
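
The late-report guess quoted above can be illustrated with a toy model. This is only a sketch of the suspected sequence, not the NameNode's actual replication-monitor logic; the function names and the "late" parameter are hypothetical, and the default target of 3 matches the dfs.replication value in our configuration.

```python
def excess_replicas(reported: int, expected: int) -> int:
    """Replicas beyond the target replication factor (hypothetical helper)."""
    return max(0, reported - expected)


def simulate_late_report(expected: int = 3, late: int = 1) -> int:
    """Toy model: `late` datanodes in the pipeline report their replica late."""
    # Replicas the namenode knows about when the monitor first looks at the block.
    reported = expected - late
    # An over-eager monitor treats the block as under-replicated and
    # schedules enough copies to reach the target.
    scheduled = expected - reported
    # The extra copies complete, and then the late pipeline reports arrive.
    reported += scheduled + late
    return excess_replicas(reported, expected)


if __name__ == "__main__":
    print(simulate_late_report(expected=3, late=1), "excess replica(s)")
```

In this model every late report turns into one excess replica, and each excess replica later has to be deleted from some datanode.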
                
> Many excess replicas getting created
> ------------------------------------
>
>                 Key: HDFS-4562
>                 URL: https://issues.apache.org/jira/browse/HDFS-4562
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 1.1.1
>            Reporter: Amareshwari Sriramadasu
>             Fix For: 1.2.0
>
>
> We are seeing too many excess replicas getting created in our cluster. The 
> number of excess replicas per day comes to more than 1 lakh (100,000).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
