Amit Anand created YARN-3947:
--------------------------------

             Summary: Add support for short host names in yarn decommissioning process
                 Key: YARN-3947
                 URL: https://issues.apache.org/jira/browse/YARN-3947
             Project: Hadoop YARN
          Issue Type: Improvement
          Components: resourcemanager
    Affects Versions: 2.6.0
            Reporter: Amit Anand
            Priority: Minor

When decommissioning YARN nodes, {yarn rmadmin -refreshNodes} does not accept short host names in the {yarn.exclude} file. A node is decommissioned only if its {FQDN} is listed there. Decommissioning in {HDFS} behaves differently: it accepts short host names.

Below are the details of what I am seeing.

My {yarn.exclude} file contains the short host name bcpc-vm1.

Running:

{code}
sudo -u yarn yarn rmadmin -refreshNodes
{code}

shows the following entries in the log file:

{code}
2015-07-21 11:14:18,795 INFO org.apache.hadoop.conf.Configuration: found resource yarn-site.xml at file:/etc/hadoop/conf/yarn-site.xml
2015-07-21 11:14:18,802 INFO org.apache.hadoop.util.HostsFileReader: Setting the includes file to
2015-07-21 11:14:18,802 INFO org.apache.hadoop.util.HostsFileReader: Setting the excludes file to /etc/hadoop/conf/yarn.exclude
2015-07-21 11:14:18,803 INFO org.apache.hadoop.util.HostsFileReader: Refreshing hosts (include/exclude) list
2015-07-21 11:14:18,803 INFO org.apache.hadoop.util.HostsFileReader: Adding bcpc-vm1 to the list of excluded hosts from /etc/hadoop/conf/yarn.exclude
2015-07-21 11:14:18,803 INFO org.apache.hadoop.util.HostsFileReader: Adding bcpc-vm1 to the list of excluded hosts from /etc/hadoop/conf/yarn.exclude
2015-07-21 11:14:18,803 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=yarn IP=10.0.100.12 OPERATION=refreshNodes TARGET=AdminService RESULT=SUCCESS
{code}

The refresh reports success, but the node is not decommissioned.

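The log above shows the short name being read into the exclude list while the node stays active. That is consistent with the exclude entries being compared as exact strings against the name the NodeManager registers under (its FQDN, for example `bcpc-vm1.example.com`). The Python sketch below only illustrates that assumption and the relaxed match this issue asks for; it is not the ResourceManager's code, and `is_excluded_exact` and `is_excluded_short_aware` are hypothetical names.

```python
from typing import Iterable


def short_name(host: str) -> str:
    """Return the first DNS label, lower-cased (e.g. 'bcpc-vm1')."""
    return host.strip().lower().split(".", 1)[0]


def is_excluded_exact(node_host: str, excludes: Iterable[str]) -> bool:
    # Assumed current behaviour: an entry must equal the registered name.
    return node_host in excludes


def is_excluded_short_aware(node_host: str, excludes: Iterable[str]) -> bool:
    # Hypothetical relaxed rule: an entry without a dot is treated as a
    # short name and compared with the node's first DNS label only.
    node = node_host.strip().lower()
    for entry in excludes:
        candidate = entry.strip().lower()
        if candidate == node:
            return True
        if "." not in candidate and candidate == short_name(node):
            return True
    return False
```

With `excludes = {"bcpc-vm1"}`, the exact comparison rejects `bcpc-vm1.example.com` while the short-name-aware one accepts it. A real implementation would also have to decide how to treat two domains that share a short name.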
When I put the {FQDN} of the host in {yarn.exclude} instead, decommissioning succeeds and I see the following in the RM logs:

{code}
2015-07-21 11:14:43,453 INFO org.apache.hadoop.conf.Configuration: found resource yarn-site.xml at file:/etc/hadoop/conf.LAB-A/yarn-site.xml
2015-07-21 11:14:43,456 INFO org.apache.hadoop.util.HostsFileReader: Setting the includes file to
2015-07-21 11:14:43,456 INFO org.apache.hadoop.util.HostsFileReader: Setting the excludes file to /etc/hadoop/conf/yarn.exclude
2015-07-21 11:14:43,456 INFO org.apache.hadoop.util.HostsFileReader: Refreshing hosts (include/exclude) list
2015-07-21 11:14:43,456 INFO org.apache.hadoop.util.HostsFileReader: Adding bcpc-vm1.example.com to the list of excluded hosts from /etc/hadoop/conf/yarn.exclude
2015-07-21 11:14:43,456 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=yarn IP=10.100.0.11 OPERATION=refreshNodes TARGET=AdminService RESULT=SUCCESS
2015-07-21 11:14:44,198 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService: Disallowed NodeManager nodeId: bcpc-vm1.example.com:35197 hostname: bcpc-vm1.example.com:35197
2015-07-21 11:14:44,198 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Deactivating Node bcpc-vm1.example.com:35197 as it is now DECOMMISSIONED
2015-07-21 11:14:44,199 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: bcpc-vm1.example.com:35197 Node Transitioned from RUNNING to DECOMMISSIONED
2015-07-21 11:14:44,199 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Removed node bcpc-vm1.example.com:35197 cluster capacity: <memory:618723, vCores:96>
{code}
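Since the FQDN entry works, one possible stopgap (not part of this report, and only a sketch) is to expand short names to FQDNs before the exclude file is written. The helper name `expand_excludes` and its `resolve` parameter are hypothetical; by default it relies on Python's `socket.getfqdn`, which returns the name unchanged when no FQDN can be found.

```python
import socket
from typing import Callable, Iterable, List


def expand_excludes(entries: Iterable[str],
                    resolve: Callable[[str], str] = socket.getfqdn) -> List[str]:
    """Return the entries with dot-less host names replaced by their FQDN."""
    expanded = []
    for raw in entries:
        host = raw.strip()
        if not host:
            continue  # skip blank lines
        # Entries that already contain a dot (FQDNs, IP addresses) pass through.
        expanded.append(host if "." in host else resolve(host))
    return expanded
```

The resulting list could then be written to the exclude file (here `/etc/hadoop/conf/yarn.exclude`) before running `yarn rmadmin -refreshNodes`. The result depends on the resolver configuration of the machine doing the expansion, so it should be checked against the names the NodeManagers actually register with.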