[ https://issues.apache.org/jira/browse/AMBARI-21868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eugene Chekanskiy updated AMBARI-21868:
---------------------------------------
    Description: 
*Problem Statement*:
Allow a user to bring back a dead host/VM under the same hostname, and provide 
an action that restores the host to the state it was in before the incident.

*Solution*:
- User gets the host/VM back with the same hostname (a different IP address is OK)
- The UI state will be INSTALLED for all host components as soon as the host 
starts heartbeating
- User initiates a Recover user action that does the following (batch 
operation):
-- Move the state of the components back to INIT (this state transition must be 
allowed)
{code}
# Reset the listed host components on the recovered host to INIT
curl -H "X-Requested-By:ambari" -u admin:admin -i -X PUT -d \
'{"RequestInfo":{"context":"Reset All Host Components","operation_level":{"level":"HOST","cluster_name":"c1","host_names":"c6401.ambari.apache.org"},"query":"HostRoles/component_name.in(DATANODE,NODEMANAGER,METRICS_MONITOR)"},"Body":{"HostRoles":{"state":"INIT"}}}' \
http://c6401.ambari.apache.org:8080/api/v1/clusters/c1/hosts/c6401.ambari.apache.org/host_components
{code}
-- KERBEROS-only steps:
--- Issue an INSTALL operation on the KERBEROS_CLIENT host component
--- Issue a PUT request to regenerate keytabs for the recovered hosts
{code}
PUT http://server:8080/api/v1/clusters/cl1?regenerate_keytabs=all&regenerate_hosts=recover_hostname1,recover_hostname2,...&ignore_config_updates=true

{"Clusters":{"security_type":"KERBEROS"}}
{code}
-- Issue an INSTALL operation on all host components (same call as the INIT 
reset above, with state = "INSTALLED")
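
The request sequence above can be sketched in Python as follows. This is only an illustration: it builds the method, URL, and body for each step and sends nothing. The function names, the context strings for the install steps, and passing the recovered host as the only entry in regenerate_hosts are assumptions, not part of the patch.

```python
from urllib.parse import urlencode


def state_change_body(cluster, host, components, state, context):
    """Build the JSON body for a bulk host-component state change."""
    return {
        "RequestInfo": {
            "context": context,
            "operation_level": {
                "level": "HOST",
                "cluster_name": cluster,
                "host_names": host,
            },
            "query": "HostRoles/component_name.in(%s)" % ",".join(components),
        },
        "Body": {"HostRoles": {"state": state}},
    }


def host_components_url(server, cluster, host):
    """URL of the host_components collection for one host."""
    return f"{server}/api/v1/clusters/{cluster}/hosts/{host}/host_components"


def regenerate_keytabs_url(server, cluster, hosts):
    """Cluster URL carrying the keytab regeneration query parameters."""
    params = {
        "regenerate_keytabs": "all",
        "regenerate_hosts": ",".join(hosts),
        "ignore_config_updates": "true",
    }
    # Keep commas literal so the host list matches the form shown above.
    return f"{server}/api/v1/clusters/{cluster}?{urlencode(params, safe=',')}"


def recovery_plan(server, cluster, host, components, kerberos=False):
    """Return the ordered (method, url, body) steps of the Recover action."""
    url = host_components_url(server, cluster, host)
    steps = [
        ("PUT", url, state_change_body(
            cluster, host, components, "INIT", "Reset All Host Components")),
    ]
    if kerberos:
        # Context string below is an assumed label, not taken from the issue.
        steps.append(("PUT", url, state_change_body(
            cluster, host, ["KERBEROS_CLIENT"], "INSTALLED",
            "Install Kerberos Client")))
        steps.append(("PUT", regenerate_keytabs_url(server, cluster, [host]),
                      {"Clusters": {"security_type": "KERBEROS"}}))
    # Context string below is an assumed label, not taken from the issue.
    steps.append(("PUT", url, state_change_body(
        cluster, host, components, "INSTALLED", "Install All Host Components")))
    return steps


if __name__ == "__main__":
    plan = recovery_plan(
        "http://c6401.ambari.apache.org:8080", "c1", "c6401.ambari.apache.org",
        ["DATANODE", "NODEMANAGER", "METRICS_MONITOR"], kerberos=True)
    for method, url, _body in plan:
        print(method, url)
```

Each tuple maps onto one curl call of the form shown above; the X-Requested-By header and credentials still have to be supplied by whatever client sends the request.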

  was:
*Problem Statement*:
Allow a user to bring back a dead host/VM under the same hostname, and provide 
an action that restores the host to the state it was in before the incident.

*Solution*:
- User gets the host/VM back with the same hostname (a different IP address is OK)
- The UI state will be INSTALLED for all host components as soon as the host 
starts heartbeating
- User initiates a Recover user action that does the following (batch 
operation):
-- Move the state of the components back to INIT (this state transition must be 
allowed)
{code}
# Reset the listed host components on the recovered host to INIT
curl -H "X-Requested-By:ambari" -u admin:admin -i -X PUT -d \
'{"RequestInfo":{"context":"Reset All Host Components","operation_level":{"level":"HOST","cluster_name":"c1","host_names":"c6401.ambari.apache.org"},"query":"HostRoles/component_name.in(DATANODE,NODEMANAGER,METRICS_MONITOR)"},"Body":{"HostRoles":{"state":"INIT"}}}' \
http://c6401.ambari.apache.org:8080/api/v1/clusters/c1/hosts/c6401.ambari.apache.org/host_components
{code}
-- Issue an INSTALL operation on all host components (same call as above with 
state = "INSTALLED")
-- Issue a START operation on all host components (same call as above with 
state = "STARTED")

*Note*: The above operation is expected to recover the security state of the 
host.


> Implement host recovery - backend changes
> -----------------------------------------
>
>                 Key: AMBARI-21868
>                 URL: https://issues.apache.org/jira/browse/AMBARI-21868
>             Project: Ambari
>          Issue Type: Task
>          Components: ambari-server
>    Affects Versions: 2.6.0
>            Reporter: Dmytro Sen
>            Assignee: Eugene Chekanskiy
>            Priority: Blocker
>             Fix For: 2.6.0
>
>         Attachments: AMBARI-21868_2.patch, AMBARI-21868.patch, 
> AMBARI-21868.patch
>



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
