I should have added this in my first email, but I do get an error in the data node's log file:
'2014-07-12 19:39:58,027 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 0 blocks got processed in 1 msecs'

On Wed, Jul 23, 2014 at 3:18 PM, andrew touchet <[email protected]> wrote:
> Hello,
>
> I am decommissioning data nodes for an OS upgrade on an HPC cluster.
> Currently, users can run jobs that use data stored on /hdfs. They are able
> to access all datanodes/compute nodes except the one being decommissioned.
>
> Is this safe to do? Will edited files affect the node being decommissioned?
>
> I've been adding the nodes to /usr/lib/hadoop-0.20/conf/hosts_exclude and
> running 'hadoop dfsadmin -refreshNodes' on the name node. Then I simply
> wait for the log files to report completion. After the upgrade, I remove
> the node from hosts_exclude and start hadoop again on the datanode.
>
> Also: under the namenode web interface I just noticed that the node I
> decommissioned previously now shows 0 Configured Capacity, Used, and
> Remaining, and is reported as 100% Used.
>
> I used the same /etc/sysconfig/hadoop file from before the upgrade,
> removed the node from hosts_exclude, and ran '-refreshNodes' afterwards.
>
> What steps have I missed in the decommissioning process or while bringing
> the data node back online?
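
For reference, the decommission / recommission cycle described above usually looks roughly like the sketch below. This assumes a Hadoop 0.20-style setup where dfs.hosts.exclude points at /usr/lib/hadoop-0.20/conf/hosts_exclude; the hostname and the datanode service name are placeholders and will differ per cluster and distribution.

    # --- decommission (run on the namenode) ---
    # add the datanode's hostname to the exclude file referenced by dfs.hosts.exclude
    echo "datanode05.example.com" >> /usr/lib/hadoop-0.20/conf/hosts_exclude

    # tell the namenode to re-read its include/exclude files
    hadoop dfsadmin -refreshNodes

    # watch the node's status; it should move from
    # "Decommission in progress" to "Decommissioned"
    hadoop dfsadmin -report | grep -A 6 datanode05

    # --- recommission after the OS upgrade ---
    # remove the hostname from hosts_exclude, then re-read the files again
    hadoop dfsadmin -refreshNodes

    # restart the datanode process on the upgraded host
    # (service name depends on the packaging; CDH-style shown here)
    service hadoop-0.20-datanode start

The "Decommission Status" field in the dfsadmin -report output is the simplest place to confirm the node has fully finished before it is taken down, rather than relying only on the log files.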
