Did anything change with DECOMMISSION in the 2.0 release?  The request appears
to succeed (it completes and reports that it updated the dfs.exclude file), but
the DataNodes aren't actually decommissioned: HDFS now shows them as dead and
says I need to restart the NameNode.  For YARN, the NodeManagers appear to have
decommissioned OK and are in decommissioned status, but it says I need to
restart the ResourceManager (this didn't use to be the case in 1.7.0).

The only difference is that I don't set maintenance mode on the DataNodes until
after the decommission completes, because that wasn't working for me at one
point (it turns out that hitting the API slightly differently would have made
it work).  Could that be the cause?  Is restarting the master services now
required after a decommission?
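
For reference, here is a minimal sketch of the kind of request body involved, assuming the custom-command format from the Ambari REST API documentation (a POST to the cluster's requests resource). The helper name build_decommission_request and the cluster name "c1" are hypothetical and only for illustration; this is not necessarily the exact call used above.

```python
import json


def build_decommission_request(cluster, service, master, slave_type, hosts):
    """Build a DECOMMISSION custom-command body (hypothetical helper).

    Assumes the Ambari custom-command layout: the command is addressed to
    the master component (e.g. NAMENODE), and the slaves to exclude are
    passed as a comma-separated list in the parameters.
    """
    return {
        "RequestInfo": {
            "context": "Decommission %s" % slave_type,
            "command": "DECOMMISSION",
            "parameters": {
                "slave_type": slave_type,
                "excluded_hosts": ",".join(hosts),
            },
            "operation_level": {
                "level": "HOST_COMPONENT",
                "cluster_name": cluster,
            },
        },
        "Requests/resource_filters": [
            {"service_name": service, "component_name": master}
        ],
    }


# "c1" is a placeholder cluster name; the hosts are the ones from this thread.
body = build_decommission_request(
    "c1", "HDFS", "NAMENODE", "DATANODE", ["slave-2.local", "slave-4.local"]
)
print(json.dumps(body, indent=2))
```

The same builder would cover the YARN case by passing "YARN", "RESOURCEMANAGER" and "NODEMANAGER" instead.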


Task output:

DataNode Decommission: slave-2.local,slave-4.local

stderr:
None
stdout:
2015-05-14 14:45:48,439 - u"File['/etc/hadoop/conf/dfs.exclude']" {'owner': 
'hdfs', 'content': Template('exclude_hosts_list.j2'), 'group': 'hadoop'}
2015-05-14 14:45:48,670 - Writing u"File['/etc/hadoop/conf/dfs.exclude']" 
because contents don't match
2015-05-14 14:45:48,864 - u"Execute['']" {'user': 'hdfs'}
2015-05-14 14:45:48,968 - u"ExecuteHadoop['dfsadmin -refreshNodes']" 
{'bin_dir': '/usr/hdp/current/hadoop-client/bin', 'conf_dir': 
'/etc/hadoop/conf', 'kinit_override': True, 'user': 'hdfs'}
2015-05-14 14:45:49,011 - u"Execute['hadoop --config /etc/hadoop/conf dfsadmin 
-refreshNodes']" {'logoutput': None, 'try_sleep': 0, 'environment': {}, 
'tries': 1, 'user': 'hdfs', 'path': ['/usr/hdp/current/hadoop-client/bin']}

DataNodes Status 3 live / 2 dead / 0 decommissioning

NodeManager Decommission: slave-2.local,slave-4.local

stderr:
None
stdout:
2015-05-14 14:47:16,491 - u"File['/etc/hadoop/conf/yarn.exclude']" {'owner': 
'yarn', 'content': Template('exclude_hosts_list.j2'), 'group': 'hadoop'}
2015-05-14 14:47:16,866 - Writing u"File['/etc/hadoop/conf/yarn.exclude']" 
because contents don't match
2015-05-14 14:47:17,057 - u"Execute[' yarn --config /etc/hadoop/conf rmadmin 
-refreshNodes']" {'environment': {'PATH': 
'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/sbin:/usr/sbin:/bin:/usr/bin:/var/lib/ambari-agent:/usr/hdp/current/hadoop-client/bin:/usr/hdp/current/hadoop-yarn-resourcemanager/bin'},
 'user': 'yarn'}

NodeManagers Status 3 active / 0 lost / 0 unhealthy / 0 rebooted / 2 
decommissioned
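
The two status summaries above can be checked mechanically. The following is a rough sketch, assuming the summary lines keep the "N label / N label" shape shown in this output; parse_status is a hypothetical helper, not part of Ambari.

```python
import re


def parse_status(line):
    """Turn a summary such as '3 live / 2 dead / 0 decommissioning'
    into a dict mapping each label to its count."""
    counts = {}
    # Each entry is a number followed by whitespace and a lowercase label.
    for count, label in re.findall(r"(\d+)\s+([a-z]+)", line):
        counts[label] = int(count)
    return counts


hdfs = parse_status("DataNodes Status 3 live / 2 dead / 0 decommissioning")
yarn = parse_status(
    "NodeManagers Status 3 active / 0 lost / 0 unhealthy / 0 rebooted / 2 "
    "decommissioned"
)

# The symptom described here: HDFS counts the two hosts as dead,
# while YARN counts the same two hosts as decommissioned.
if hdfs.get("dead", 0) > 0:
    print("HDFS reports %d dead DataNode(s)" % hdfs["dead"])
print("YARN reports %d decommissioned NodeManager(s)" % yarn.get("decommissioned", 0))
```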
