The only way (in 0.12.3) to decommission is to put the name of the node(s) in the excludes file and then run the "dfsadmin -refreshNodes" command. Is it possible that there isn't enough free space on your cluster to move all the data from those 10 nodes to the remaining ones? Also, the reason the decommission process might not have finished could be one of those listed in an existing issue: http://issues.apache.org/jira/browse/HADOOP-1184. This issue will be fixed in the next release.
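
In case it helps, here is a rough sketch of the steps. The exclude file path and the hostname below are only examples; use whatever your namenode's configuration actually points at:

  # In hadoop-site.xml on the namenode, dfs.hosts.exclude must name a
  # local file (the path here is just an example):
  #
  #   <property>
  #     <name>dfs.hosts.exclude</name>
  #     <value>/home/hadoop/conf/excludes</value>
  #   </property>

  # Put the hostnames to be decommissioned in that file, one per line
  # ("datanode17.example.com" is a made-up example):
  echo "datanode17.example.com" >> /home/hadoop/conf/excludes

  # Then tell the namenode to re-read the include/exclude files:
  bin/hadoop dfsadmin -refreshNodes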
Thanks,
dhruba

-----Original Message-----
From: Johan Oskarsson [mailto:[EMAIL PROTECTED]
Sent: Monday, April 30, 2007 2:53 AM
To: [email protected]
Subject: Re: Decommission of datanodes

I was under the impression that the only way to decommission nodes in
version 0.12.3 is to specify the nodes in a file and then point
dfs.hosts.exclude to that file.

/Johan

Timothy Chklovski wrote:
> Which commands are you issuing to decommission the nodes?
>
> On 4/29/07, Johan Oskarsson <[EMAIL PROTECTED]> wrote:
>>
>> Hi.
>>
>> I'm trying to decommission 10 of the 35 datanodes in our cluster. The
>> process has been running for a couple of days, but only one node has
>> finished. Perhaps I should have tried to decommission one at a time?
>> I was afraid that would lead to unnecessary transfers, as a node being
>> decommissioned would probably have copied data to other nodes that I
>> was going to decommission later.
>>
>> Is there no way of seeing how far the process has come?
>>
>> The logs contain a lot of these:
>>
>> 2007-04-29 16:56:56,411 WARN org.apache.hadoop.fs.FSNamesystem: Not able
>> to place enough replicas, still in need of 1
>>
>> Is that related to the decommission process?
>>
>> /Johan
