I opened a jira for tracking this issue: https://issues.apache.org/jira/browse/HDFS-5046
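
For anyone who wants to try Harsh's suggestion below (use -lsr to find files that were created with a replication factor of 3 instead of the configured 2), here is a rough sketch in Python. It assumes the usual Hadoop 1.x `hadoop fs -lsr` column layout (permissions, replication, owner, group, size, date, time, path), with `-` in the replication column for directories. The function name and the sample listing are made up for illustration; they are not output from my cluster.

```python
def files_with_unexpected_replication(lsr_output, expected=2):
    """Return (path, replication) pairs for files whose replication != expected.

    Assumes each line looks like:
      -rw-r--r--   3 owner group  size  yyyy-mm-dd hh:mm  /path
    Directories carry '-' in the replication column and are skipped.
    """
    hits = []
    for line in lsr_output.splitlines():
        # Split into at most 8 fields so a path containing spaces stays whole.
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue  # blank or unrecognized line
        perms, repl, path = fields[0], fields[1], fields[7]
        if perms.startswith("d") or not repl.isdigit():
            continue  # directory entry, no replication factor
        if int(repl) != expected:
            hits.append((path, int(repl)))
    return hits


# Hypothetical listing, only to show the expected input shape.
sample = """\
drwxr-xr-x   - hadoop supergroup          0 2013-06-21 12:00 /user
-rw-r--r--   2 hadoop supergroup       1024 2013-06-21 12:01 /user/a.txt
-rw-r--r--   3 hadoop supergroup       2048 2013-06-21 12:02 /user/b.jar
"""

for path, repl in files_with_unexpected_replication(sample, expected=2):
    print(path, repl)
```

To run it against a real cluster, save the output of `hadoop fs -lsr /` to a file and pass the file's contents in. Any file it reports was created with a replication factor other than the one in hdfs-site.xml, which would be consistent with what Harsh describes.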

2013/7/2 sam liu <[email protected]>

> Yes, the default replication factor is 3. However, my case is strange:
> while the decommission hangs, I found that some blocks' expected replica
> count is 3, but the 'dfs.replication' value in hdfs-site.xml on every
> cluster node has been 2 since the cluster was first set up. Below are my
> steps:
>
> 1. Install a Hadoop 1.1.1 cluster with 2 datanodes: dn1 and dn2. In
>    hdfs-site.xml, set 'dfs.replication' to 2.
> 2. Add node dn3 to the cluster as a new datanode, without changing the
>    'dfs.replication' value in hdfs-site.xml (keep it as 2).
>    Note: step 2 passed.
> 3. Decommission dn3 from the cluster.
>
> Expected result: dn3 is decommissioned successfully.
> Actual result:
> a) The decommission hangs and the status always remains 'Waiting DataNode
>    status: Decommissioned'. But if I execute 'hadoop dfs -setrep -R 2 /',
>    the decommission continues and eventually completes.
> b) However, if the initial cluster includes >= 3 datanodes, this issue is
>    not encountered when adding/removing another datanode. For example, if
>    I set up a cluster with 3 datanodes, I can successfully add a 4th
>    datanode and then successfully remove it from the cluster.
>
> I suspect it's a bug and plan to open a jira against Hadoop HDFS for this.
> Any comments?
>
> Thanks!
>
>
> 2013/6/21 Harsh J <[email protected]>
>
>> The dfs.replication is a per-file parameter. If you have a client that
>> does not use the supplied configs, then its default replication is 3
>> and all files it will create (as part of the app or via a job config)
>> will be with replication factor 3.
>>
>> You can do an -lsr to find all files and filter which ones have been
>> created with a factor of 3 (versus expected config of 2).
>>
>> On Fri, Jun 21, 2013 at 3:13 PM, sam liu <[email protected]> wrote:
>> > Hi George,
>> >
>> > Actually, in my hdfs-site.xml, I always set 'dfs.replication' to 2,
>> > but I still encounter this issue.
>> >
>> > Thanks!
>> >
>> >
>> > 2013/6/21 George Kousiouris <[email protected]>
>> >>
>> >> Hi,
>> >>
>> >> I think I have faced this before. The problem is that you have the
>> >> rep factor=3, so it seems to hang because it needs 3 nodes to achieve
>> >> the factor (replicas are not created on the same node). If you set
>> >> the replication factor=2, I think you will not have this issue. So in
>> >> general you must make sure that the rep factor is <= the number of
>> >> available datanodes.
>> >>
>> >> BR,
>> >> George
>> >>
>> >>
>> >> On 6/21/2013 12:29 PM, sam liu wrote:
>> >>
>> >> Hi,
>> >>
>> >> I encountered an issue which hangs the decommission operation. The
>> >> steps:
>> >> 1. Install a Hadoop 1.1.1 cluster with 2 datanodes: dn1 and dn2. In
>> >>    hdfs-site.xml, set 'dfs.replication' to 2.
>> >> 2. Add node dn3 to the cluster as a new datanode, without changing
>> >>    the 'dfs.replication' value in hdfs-site.xml (keep it as 2).
>> >>    Note: step 2 passed.
>> >> 3. Decommission dn3 from the cluster.
>> >>
>> >> Expected result: dn3 is decommissioned successfully.
>> >>
>> >> Actual result: the decommission hangs and the status always remains
>> >> 'Waiting DataNode status: Decommissioned'.
>> >>
>> >> However, if the initial cluster includes >= 3 datanodes, this issue
>> >> is not encountered when adding/removing another datanode.
>> >>
>> >> Also, after step 2, I noticed that some blocks' expected replica
>> >> count is 3, but the 'dfs.replication' value in hdfs-site.xml is
>> >> always 2!
>> >>
>> >> Could anyone please help triage this?
>> >>
>> >> Thanks in advance!
>> >>
>> >>
>> >> --
>> >> ---------------------------
>> >>
>> >> George Kousiouris, PhD
>> >> Electrical and Computer Engineer
>> >> Division of Communications,
>> >> Electronics and Information Engineering
>> >> School of Electrical and Computer Engineering
>> >> Tel: +30 210 772 2546
>> >> Mobile: +30 6939354121
>> >> Fax: +30 210 772 2569
>> >> Email: [email protected]
>> >> Site: http://users.ntua.gr/gkousiou/
>> >>
>> >> National Technical University of Athens
>> >> 9 Heroon Polytechniou str., 157 73 Zografou, Athens, Greece
>> >
>> >
>>
>>
>> --
>> Harsh J
>>
>
