[ 
https://issues.apache.org/jira/browse/AMBARI-25672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

h.s updated AMBARI-25672:
-------------------------
    Summary: delete a host from a kerberos cluster not completely clear all 
components kerberos identies in database and kdc  (was: delete host from a 
kerberos cluster not completely clear all components kerberos identies in 
database and kdc)

> delete a host from a kerberos cluster not completely clear all components 
> kerberos identies in database and kdc
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: AMBARI-25672
>                 URL: https://issues.apache.org/jira/browse/AMBARI-25672
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-server
>    Affects Versions: 2.7.3
>            Reporter: h.s
>            Priority: Major
>
> step 1:
>  # delete a host (not a master host) from a Kerberos cluster
>  # stop all the services on the host
>  # delete the host via the API
> step 2:
>  # prepare a host and install the agent
>  # add the node to the cluster via the API and install services
>  # regenerate_keytab
>  # Ambari hangs at preparing operations/hostname/preparing operations
> This happens because step 1.3 does not completely clear this host's Kerberos 
> identities in either the database (MySQL) or the KDC (kdc.admin).
>  * in MySQL
>           there are 4 tables: kkp_mapping_service, kerberos_keytab_principal, 
> kerberos_keytab, and kerberos_principal. Host-related Kerberos identities in 
> these tables must be deleted completely.
>  * in the KDC, running
> {code:java}
> kadmin.local
> listprincs *hostname*{code}
> shows that the related identities were not deleted completely.
> The Kerberos identities of some services are deleted from MySQL and the KDC, 
> but those of other services are not. If any service's Kerberos identities 
> are left behind, the next attempt to add a host to this cluster hangs at 
> preparing operations.
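
To check the listprincs output for leftovers, something like the following could be used. It is only a sketch: the class and method names are made up for illustration, and it assumes principals have the usual service/host@REALM form.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Ambari.
class LeftoverPrincipals {

    /**
     * Returns the principals whose instance part (the text between '/' and
     * the last '@') equals the given host name, compared case-insensitively.
     */
    static List<String> forHost(List<String> principals, String hostName) {
        String wanted = hostName.toLowerCase(Locale.ROOT);
        List<String> matches = new ArrayList<>();
        for (String principal : principals) {
            int slash = principal.indexOf('/');
            int at = principal.lastIndexOf('@');
            if (slash < 0 || at < slash) {
                continue; // no instance part or no realm, e.g. "admin@EXAMPLE.COM"
            }
            String instance = principal.substring(slash + 1, at).toLowerCase(Locale.ROOT);
            if (instance.equals(wanted)) {
                matches.add(principal);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Example input; the principal names are invented.
        List<String> listed = List.of(
            "dn/host1.example.com@EXAMPLE.COM",
            "nm/host1.example.com@EXAMPLE.COM",
            "dn/host2.example.com@EXAMPLE.COM",
            "admin/admin@EXAMPLE.COM");
        System.out.println(forHost(listed, "host1.example.com"));
    }
}
```

An empty result for the deleted host would mean the KDC side is clean; the four MySQL tables still have to be checked separately.
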
>  
> Delete-host API call chain in ambari-server:
> {code:java}
> org.apache.ambari.server.api.services.HostService#deleteHost
> org.apache.ambari.server.api.services.BaseService#handleRequest
> org.apache.ambari.server.api.services.BaseRequest#process
> org.apache.ambari.server.api.handlers.BaseManagementHandler#handleRequest
> org.apache.ambari.server.api.handlers.DeleteHandler#persist
> org.apache.ambari.server.api.services.persistence.PersistenceManagerImpl#delete
> org.apache.ambari.server.controller.internal.ClusterControllerImpl#deleteResources
> org.apache.ambari.server.controller.internal.AbstractAuthorizedResourceProvider#deleteResources
> org.apache.ambari.server.controller.internal.HostResourceProvider#deleteResourcesAuthorized
> org.apache.ambari.server.controller.internal.HostResourceProvider#deleteHosts
> A=org.apache.ambari.server.controller.internal.HostResourceProvider#processDeleteHostRequests
> {code}
>  
> A=org.apache.ambari.server.controller.internal.HostResourceProvider#processDeleteHostRequests
>  has these main steps:
> {code:java}
> A1=org.apache.ambari.server.controller.AmbariManagementControllerImpl#deleteHostComponents
>   //this step deletes the components and their Kerberos identities
> A2=org.apache.ambari.server.state.cluster.ClustersImpl#deleteHost //this step 
> deletes the host from MySQL{code}
>  
>  
> A1=org.apache.ambari.server.controller.AmbariManagementControllerImpl#deleteHostComponents
>  call chain
> {code:java}
> org.apache.ambari.server.state.ServiceComponentImpl#deleteServiceComponentHosts
> A1-1=org.apache.ambari.server.state.svccomphost.ServiceComponentHostImpl#delete
> {code}
> A1-1=org.apache.ambari.server.state.svccomphost.ServiceComponentHostImpl#delete
>  call chain
> {code:java}
> org.apache.ambari.server.state.cluster.ClusterImpl#removeServiceComponentHost
> A1-1-1=eventPublisher.publish(event);  //publishes a 
> //ServiceComponentUninstalledEvent. 
> //org.apache.ambari.server.controller.utilities.KerberosIdentityCleaner#componentRemoved
> //handles this event and deletes the component's Kerberos identities. Once the 
> //event is published, the next line of code executes without waiting for the 
> //event handling to finish.
> {code}
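
The effect of publishing without waiting can be shown with a small stand-alone sketch. This is not Ambari code: a single-thread executor stands in for the event bus, and a latch forces the handler to run after the caller so that this one possible ordering shows up on every run. In the real server the order is a race.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class, not part of Ambari.
class AsyncPublishDemo {

    /** Runs the demo and returns the order in which the two sides executed. */
    static List<String> run() {
        List<String> log = new CopyOnWriteArrayList<>();
        CountDownLatch callerContinued = new CountDownLatch(1);
        ExecutorService bus = Executors.newSingleThreadExecutor(); // stand-in for the event bus
        try {
            // "publish": hand the handler to another thread and return immediately
            Future<?> handler = bus.submit(() -> {
                try {
                    callerContinued.await(); // models a handler that is slower than the caller
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                log.add("handler: componentRemoved runs");
            });
            // the caller's next line runs without waiting for the handler
            log.add("caller: next line runs");
            callerContinued.countDown();
            handler.get(); // only the demo waits here, so that the log is complete
            return log;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            bus.shutdown();
        }
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

If the caller's next steps include deleting the host, the handler may look the host up only after it is gone or after it has been re-inserted.
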
> A1-1-1=org.apache.ambari.server.controller.utilities.KerberosIdentityCleaner#componentRemoved
>  call chain
> {code:java}
> org.apache.ambari.server.controller.utilities.RemovableIdentities#remove
> org.apache.ambari.server.controller.KerberosHelperImpl#deleteIdentities(org.apache.ambari.server.state.Cluster,
>  java.util.List<org.apache.ambari.server.serveraction.kerberos.Component>, 
> java.util.Set<java.lang.String>)
> org.apache.ambari.server.controller.KerberosHelperImpl#validateKDCCredentials(org.apache.ambari.server.controller.KerberosDetails,
>  org.apache.ambari.server.state.Cluster) //check KDC administrator credentials
> A1-1-1-1=org.apache.ambari.server.controller.DeleteIdentityHandler#addDeleteIdentityStages
>  //adds the stages that prepare and delete the identities
> {code}
> A1-1-1-1=org.apache.ambari.server.controller.DeleteIdentityHandler#addDeleteIdentityStages
>  call chain
> {code:java}
> if (manageIdentities) {
>   addPrepareDeleteIdentity(cluster, hostParamsJson, event, commandParameters, 
> stageContainer);
>   addDeleteKeytab(cluster, commandParameters.getAffectedHostNames(), 
> hostParamsJson, commandParameters, stageContainer);
>   addDestroyPrincipals(cluster, hostParamsJson, event, commandParameters, 
> stageContainer);
> }
> org.apache.ambari.server.controller.DeleteIdentityHandler#addDeleteKeytab 
> //checks whether the host exists to decide whether to create this stage. For 
> //the component Kerberos identities to be deleted, this stage must not be 
> //created, i.e. the host-exists check must return false because A2 has 
> //already deleted this host from MySQL
> org.apache.ambari.server.controller.DeleteIdentityHandler#addDestroyPrincipals
>  
> org.apache.ambari.server.serveraction.kerberos.DestroyPrincipalsServerAction#execute
>  //deletes the component Kerberos identities in both MySQL and the KDC. It uses 
> //kerberosKeytabPrincipalEntities = 
> //kerberosKeytabPrincipalDAO.findByFilters(filters); to get the entities to 
> //delete. For the component Kerberos identities to be deleted, 
> //kerberosKeytabPrincipalEntities must not be empty, i.e. 
> //org.apache.ambari.server.orm.dao.KerberosKeytabPrincipalDAO#findByFilter 
> //must not return an empty result
> A-1-1-1-1-1=org.apache.ambari.server.orm.dao.KerberosKeytabPrincipalDAO#findByFilter
> {code}
> A-1-1-1-1-1=org.apache.ambari.server.orm.dao.KerberosKeytabPrincipalDAO#findByFilter
>  call chain
> {code:java}
> for (String hostname : filter.getHostNames()) {
> // host == null sets hasNull = true. If the filter holds only one host and 
> // that host was re-inserted, the lookup finds it, but the new host id has no 
> // identities in the MySQL kkp tables.
> HostEntity host = hostDAO.findByName(hostname);
> Predicate hostIDPredicate = (hostIds.isEmpty()) ? null : 
> root.get("hostId").in(hostIds);
> Predicate hostNullIDPredicate = (hasNull) ? root.get("hostId").isNull() : 
> null;
> {code}
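
The filter behaviour described above can be modelled in a few lines. This is a simplified model, not the Ambari code. It rests on assumptions: the two predicates are combined with OR, rows left behind by the deleted host have a null host id, and a plain map stands in for hostDAO.findByName. The class, the Row type and the host id 42 are made up.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model, not part of Ambari.
class HostFilterModel {

    /** A kkp row reduced to the fields the filter looks at; hostId may be null. */
    static final class Row {
        final Long hostId;
        final String principal;

        Row(Long hostId, String principal) {
            this.hostId = hostId;
            this.principal = principal;
        }
    }

    static List<Row> filterByHosts(List<Row> rows, Map<String, Long> hostsTable, List<String> hostNames) {
        Set<Long> hostIds = new HashSet<>();
        boolean hasNull = false;
        for (String name : hostNames) {
            Long id = hostsTable.get(name); // stand-in for hostDAO.findByName
            if (id == null) {
                hasNull = true;
            } else {
                hostIds.add(id);
            }
        }
        List<Row> matches = new ArrayList<>();
        for (Row row : rows) {
            boolean idMatch = row.hostId != null && hostIds.contains(row.hostId);
            boolean nullMatch = hasNull && row.hostId == null;
            if (idMatch || nullMatch) { // assumption: the predicates are OR-ed
                matches.add(row);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<Row> leftover = List.of(new Row(null, "dn/host1.example.com@EXAMPLE.COM"));
        List<String> names = List.of("host1.example.com");
        // host really gone: the leftover row is found
        System.out.println(filterByHosts(leftover, Map.of(), names).size());
        // host re-inserted under a new id: nothing is found, so nothing is deleted
        System.out.println(filterByHosts(leftover, Map.of("host1.example.com", 42L), names).size());
    }
}
```

Under these assumptions the re-inserted host hides the leftover rows, which matches the size-0 result described in this report.
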
> A2=org.apache.ambari.server.state.cluster.ClustersImpl#deleteHost call chain
> {code:java}
> org.apache.ambari.server.state.cluster.ClustersImpl#deleteHostEntityRelationships
> org.apache.ambari.server.state.cluster.ClustersImpl#unmapHostFromClusters 
> org.apache.ambari.server.state.cluster.ClustersImpl#unmapHostClusterEntities 
> //delete host cluster mapping  
> org.apache.ambari.server.orm.dao.KerberosKeytabPrincipalDAO#removeByHost
> hostDAO.remove(entity); // Note, if the host is still heartbeating, then new 
> records will be re-inserted into the hosts and hoststate tables
> {code}
> There are 4 reasons why some services' Kerberos identities cannot be deleted:
>  * first, kdc.admin.credential is lost, possibly because of an ambari-server 
> restart
> solve: make sure kdc.admin.credential exists when deleting the host; if it 
> does not, add it with a POST request
>  * second, A1-1-1-1 executes before A2: addDeleteKeytab checks whether the 
> host exists and, because A2 has not run yet, the host still exists, so the 
> stage is added. When that stage executes it always fails, so handling of this 
> ServiceComponentUninstalledEvent fails and the component in the event leaves 
> its Kerberos identity in MySQL and the KDC
> solve: in addDeleteKeytab, repeat the host-exists check and wait for A2 to 
> finish; A2 usually finishes before A1-1-1-1, so the wait is no more than 1 or 
> 2 seconds
>  * third, A2 executes, but the host is still heartbeating and is re-inserted 
> into hosts; A1-1-1-1 then executes, adds the addDeleteKeytab stage, and fails
> solve: in addDeleteKeytab, check that the host exists and also that it belongs 
> to a cluster, to make sure it is not a re-inserted host; a re-inserted host 
> has no cluster mapping
>  * fourth, A1-1-1-1 filters kerberosKeytabPrincipalEntities (kkpes) using 
> A-1-1-1-1-1, which finds the re-inserted host, so kkpes has size 0 and this 
> ServiceComponentUninstalledEvent leaves the component's Kerberos identities in 
> MySQL and the KDC
> solve: in A-1-1-1-1-1 (the findByFilter method), check that the host exists 
> and that it is in a cluster, to exclude a re-inserted host when the filter 
> holds only one host (with more than one host, the method does not hit this 
> error)
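
The two checks proposed in the list above (wait for A2, and treat a host without a cluster mapping as re-inserted) could look roughly like this. It is a sketch only: the class, the method names and the in-memory sets and maps are hypothetical stand-ins for the real lookups, and nothing here is taken from the Ambari code base.

```java
import java.util.Map;
import java.util.Set;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of Ambari.
class HostChecks {

    /**
     * True only if the host is known AND mapped to at least one cluster.
     * A host re-inserted by a heartbeat is known but has no cluster mapping.
     */
    static boolean isClusterHost(String hostName, Set<String> knownHosts,
                                 Map<String, Set<String>> hostToClusters) {
        if (!knownHosts.contains(hostName)) {
            return false;
        }
        Set<String> clusters = hostToClusters.get(hostName);
        return clusters != null && !clusters.isEmpty();
    }

    /**
     * Polls the condition, sleeping between attempts, and checks it one last
     * time after the final sleep. Returns false if interrupted.
     */
    static boolean waitUntil(BooleanSupplier condition, int maxAttempts, long sleepMillis) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (condition.getAsBoolean()) {
                return true;
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return condition.getAsBoolean();
    }

    public static void main(String[] args) {
        Set<String> known = Set.of("host1.example.com");
        // re-inserted host: known, but not mapped to any cluster
        System.out.println(isClusterHost("host1.example.com", known, Map.of()));
        // wait up to about 2 seconds for the host to stop being a cluster host
        System.out.println(waitUntil(
            () -> !isClusterHost("host1.example.com", known, Map.of()), 10, 200));
    }
}
```

A bounded wait keeps the delete request from blocking forever if A2 never finishes; the 10 attempts of 200 ms are only a guess based on the 1 or 2 seconds mentioned above.
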



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
