Sumit Mohanty created AMBARI-4270:
-------------------------------------

             Summary: Add decommission support for TaskTracker and modify 
support for DataNode to match
                 Key: AMBARI-4270
                 URL: https://issues.apache.org/jira/browse/AMBARI-4270
             Project: Ambari
          Issue Type: Bug
          Components: controller
    Affects Versions: 1.5.0
            Reporter: Sumit Mohanty
            Assignee: Sumit Mohanty
             Fix For: 1.5.0


*Current implementation:*
Ambari uses the following steps to perform DN decommissioning/recommissioning. 

When a DataNode (DN) is identified for decommission or recommission, a config 
entry is created or updated. The config-type is hdfs-exclude-file, and it 
contains only one config property: a comma-separated list of decommissioned 
hosts. Decommissioning a host adds an entry to the list; recommissioning a DN 
deletes its entry.
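
As a rough illustration of the list handling described above, the following Python sketch adds or removes one host from a comma-separated property value. The function name and arguments are hypothetical and are not taken from Ambari's code.

```python
def update_exclude_list(current, host, decommission):
    """Return the new comma-separated host list after one change.

    `current` is the existing property value, `host` is the host being
    changed, and `decommission` is True to add it, False to remove it.
    All names here are hypothetical, for illustration only.
    """
    # Split the existing value, ignoring empty pieces from "" or ",,".
    hosts = [h.strip() for h in current.split(",") if h.strip()]
    if decommission:
        if host not in hosts:
            hosts.append(host)  # avoid duplicate entries
    else:
        hosts = [h for h in hosts if h != host]
    return ",".join(hosts)
```

Each call yields the value that would be stored as a new version of the config-type.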

Afterwards, ambari-server is asked to perform the decommission based on the 
above config-type. Each change adds a new version of the config-type, and the 
version value (tag) is provided as a reference to the backend (BE). The Ambari 
BE uses the specified version to materialize the exclude file. After the file 
is created, refreshNodes is called.

Decommission happens in the background, while the frontend (FE) remembers the 
tag of the latest decommission config-type and uses it to render which hosts 
are decommissioned.

*Goal*
* Convert to a single API call
* Have BE store the decommission-ness of host components

*Proposal*
Define a flag "admin_state" for slave host components. When this flag is set to 
"DECOMMISSIONED", the component is decommissioned; otherwise it is not.

ClusterHostInfo data structure is modified to add the following new information:
* excluded_datanodes = [2,3,7-10]
* excluded_tasktrackers = [2-5]
* excluded_nodemanagers = [4,7]

The numbers above are indices into the list of hosts that is sent to the agent 
with each command. The indices identify the hosts that are decommissioned.
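
The index notation could be decoded along these lines. This is a minimal Python sketch that assumes ranges such as 7-10 are inclusive and that indices are zero-based positions in the host list; neither detail is stated above, and the function names are hypothetical.

```python
def expand_indices(spec):
    """Expand a string such as "2,3,7-10" into a flat list of ints.

    Assumption: "7-10" is an inclusive range (7, 8, 9, 10).
    """
    result = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        if "-" in item:
            start, end = item.split("-", 1)
            result.extend(range(int(start), int(end) + 1))
        else:
            result.append(int(item))
    return result


def excluded_hostnames(all_hosts, spec):
    """Map decoded indices to hostnames.

    Assumption: indices are zero-based positions in `all_hosts`.
    """
    return [all_hosts[i] for i in expand_indices(spec)]
```

For example, with a host list of eleven entries, the value "2,3,7-10" would select six hosts under these assumptions.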

The above information is consumed by the CONFIGURE and DECOMMISSION commands 
for various MASTER components. The implementation of the DECOMMISSION command 
will read the hostnames, create the appropriate exclude file, and call 
"-refreshNodes". CONFIGURE can simply create the exclude file, because a START 
issued after CONFIGURE will automatically consume the exclude configuration.
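
A DECOMMISSION handler along these lines might look like the Python sketch below. It is only an illustration: the function names are hypothetical, the file is written one hostname per line (the format Hadoop exclude files use), and the refresh commands listed are the standard Hadoop admin CLIs, which may differ from what the Ambari scripts actually invoke. The sketch returns the command instead of running it.

```python
import os
import tempfile


def write_exclude_file(path, hostnames):
    """Write one hostname per line, replacing the file atomically."""
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        handle.write("".join(name + "\n" for name in hostnames))
    os.replace(tmp_path, path)  # readers never see a half-written file


def refresh_command(master_component):
    """Return the argv a handler would run after writing the file.

    Assumption: the mapping below uses the standard Hadoop CLIs.
    """
    commands = {
        "NAMENODE": ["hdfs", "dfsadmin", "-refreshNodes"],
        "JOBTRACKER": ["hadoop", "mradmin", "-refreshNodes"],
        "RESOURCEMANAGER": ["yarn", "rmadmin", "-refreshNodes"],
    }
    return commands[master_component]
```

A CONFIGURE handler would call only write_exclude_file, since the master reads the file when it starts.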

Sample API calls:
{noformat}
curl -u admin:admin -H "X-Requested-By: ambari" -X POST -d \
'{"RequestInfo":{"context":"Decommission DataNode",
"command":"DECOMMISSION","service_name":"HDFS",
"component_name":"NAMENODE",
"parameters":{"excluded_hosts":"c6401.ambari.apache.org"}}}' \
http://localhost:8080/api/v1/clusters/c1/requests
{noformat}

The above call decommissions "c6401.ambari.apache.org", which hosts a DataNode.

{noformat}
curl -u admin:admin -H "X-Requested-By: ambari" -X POST -d \
'{"RequestInfo":{"context":"Decommission DataNode",
"command":"DECOMMISSION","service_name":"MAPREDUCE",
"component_name":"JOBTRACKER",
"parameters":{"slave_type":"TASKTRACKER",
"included_hosts":"c6402.ambari.apache.org,c6403.ambari.apache.org"}}}' \
http://localhost:8080/api/v1/clusters/c1/requests
{noformat}

The above call recommissions "c6402.ambari.apache.org" and 
"c6403.ambari.apache.org", where the TaskTracker components are currently 
decommissioned.
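
For clients that build these requests programmatically, the two payload shapes above could be generated with a helper like the following Python sketch. The helper and its argument names are hypothetical; only the JSON keys come from the sample calls.

```python
import json


def build_request_body(context, service_name, component_name, hosts,
                       recommission=False, slave_type=None):
    """Build the JSON body for a DECOMMISSION request.

    Hosts go under "included_hosts" when recommissioning and under
    "excluded_hosts" when decommissioning, as in the sample calls.
    """
    parameters = {}
    if slave_type is not None:
        parameters["slave_type"] = slave_type
    key = "included_hosts" if recommission else "excluded_hosts"
    parameters[key] = ",".join(hosts)
    return json.dumps({
        "RequestInfo": {
            "context": context,
            "command": "DECOMMISSION",
            "service_name": service_name,
            "component_name": component_name,
            "parameters": parameters,
        }
    })
```

The resulting string can be passed to curl with -d, or sent with any HTTP client, to the /api/v1/clusters/c1/requests endpoint shown above.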





--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
