[
https://issues.apache.org/jira/browse/AMBARI-25672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
h.s updated AMBARI-25672:
-------------------------
Summary: delete host from a kerberos cluster not completely clear all
components kerberos identies in database and kdc (was: delete host from a
kerberos cluster not completely clear all identies in database and kdc)
> delete host from a kerberos cluster not completely clear all components
> kerberos identies in database and kdc
> -------------------------------------------------------------------------------------------------------------
>
> Key: AMBARI-25672
> URL: https://issues.apache.org/jira/browse/AMBARI-25672
> Project: Ambari
> Issue Type: Bug
> Components: ambari-server
> Affects Versions: 2.7.3
> Reporter: h.s
> Priority: Major
>
> Step 1:
> # Pick a host to delete from a Kerberos-enabled cluster (not a master host).
> # Stop all services on the host.
> # Delete the host through the API.
> Step 2:
> # Prepare a host and install the agent.
> # Add the node to the cluster through the API and install services.
> # Run regenerate_keytab.
> # Ambari hangs at preparing operations/hostname/preparing operations.
> This happens because step 1.3 does not completely clear the host's Kerberos
> identities from the database (MySQL) and the KDC (kdc.admin).
> * In MySQL,
> four tables hold Kerberos identities: kkp_mapping_service,
> kerberos_keytab_principal, kerberos_keytab, and kerberos_principal. The
> host-related Kerberos identities in these tables must be deleted completely.
> * In the KDC,
> {code:java}
> kadmin.local
> listprincs *hostname*{code}
> shows that the related identities have not been deleted completely.
> The Kerberos identities of some services are deleted from MySQL and the KDC,
> but those of other services are not.
> If any service's Kerberos identities are left behind, the next attempt to add
> a host to this cluster hangs at preparing operations.
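The leftover check can be scripted once the listprincs output has been captured. The following is a minimal sketch, assuming principal names follow the usual primary/instance@REALM form; the class name, method names, and sample principals are hypothetical and not part of Ambari.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

/** Hypothetical helper: finds principals that still reference a deleted host. */
final class LeftoverPrincipals {

    /** Returns the principals whose instance part equals the hostname (case-insensitive). */
    static List<String> forHost(List<String> principals, String hostname) {
        String wanted = hostname.toLowerCase(Locale.ROOT);
        return principals.stream()
                .filter(p -> wanted.equals(instanceOf(p)))
                .collect(Collectors.toList());
    }

    /** Extracts the instance (host) part of primary/instance@REALM, or null if there is none. */
    static String instanceOf(String principal) {
        int slash = principal.indexOf('/');
        if (slash < 0) {
            return null; // no instance part, e.g. admin@EXAMPLE.COM
        }
        int at = principal.indexOf('@', slash);
        String instance = (at < 0)
                ? principal.substring(slash + 1)
                : principal.substring(slash + 1, at);
        return instance.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for captured listprincs output.
        List<String> listed = Arrays.asList(
                "nn/master1.example.com@EXAMPLE.COM",
                "dn/worker7.example.com@EXAMPLE.COM",
                "HTTP/worker7.example.com@EXAMPLE.COM",
                "admin@EXAMPLE.COM");
        System.out.println(forHost(listed, "worker7.example.com"));
    }
}
```

An empty result for the deleted host would suggest the KDC side is clean; the MySQL tables still need to be checked separately.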
>
> Call chain of the delete-host API in ambari-server:
> {code:java}
> org.apache.ambari.server.api.services.HostService#deleteHost
> org.apache.ambari.server.api.services.BaseService#handleRequest
> org.apache.ambari.server.api.services.BaseRequest#process
> org.apache.ambari.server.api.handlers.BaseManagementHandler#handleRequest
> org.apache.ambari.server.api.handlers.DeleteHandler#persist
> org.apache.ambari.server.api.services.persistence.PersistenceManagerImpl#delete
> org.apache.ambari.server.controller.internal.ClusterControllerImpl#deleteResources
> org.apache.ambari.server.controller.internal.AbstractAuthorizedResourceProvider#deleteResources
> org.apache.ambari.server.controller.internal.HostResourceProvider#deleteResourcesAuthorized
> org.apache.ambari.server.controller.internal.HostResourceProvider#deleteHosts
> A=org.apache.ambari.server.controller.internal.HostResourceProvider#processDeleteHostRequests
> {code}
>
> A=org.apache.ambari.server.controller.internal.HostResourceProvider#processDeleteHostRequests
> has two main steps:
> {code:java}
> A1=org.apache.ambari.server.controller.AmbariManagementControllerImpl#deleteHostComponents
> // deletes the components and their Kerberos identities
> A2=org.apache.ambari.server.state.cluster.ClustersImpl#deleteHost
> // deletes the host from MySQL
> {code}
>
>
> A1=org.apache.ambari.server.controller.AmbariManagementControllerImpl#deleteHostComponents
> call chain
> {code:java}
> org.apache.ambari.server.state.ServiceComponentImpl#deleteServiceComponentHosts
> A1-1=org.apache.ambari.server.state.svccomphost.ServiceComponentHostImpl#delete
> {code}
> A1-1=org.apache.ambari.server.state.svccomphost.ServiceComponentHostImpl#delete
> call chain
> {code:java}
> org.apache.ambari.server.state.cluster.ClusterImpl#removeServiceComponentHost
> A1-1-1=eventPublisher.publish(event);
> // Publishes a ServiceComponentUninstalledEvent, which is handled by
> // org.apache.ambari.server.controller.utilities.KerberosIdentityCleaner#componentRemoved.
> // The handler deletes the component's Kerberos identities. publish() does not
> // wait for the handler to finish: the next line runs as soon as the event is
> // published.
> {code}
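The ordering problem can be reproduced outside Ambari. The following is a minimal sketch, assuming an event publisher that hands events to a background thread; AsyncPublishDemo and everything in it are hypothetical stand-ins, not Ambari classes. The latch forces the handler to run only after the caller has already deleted the host, which is the unlucky interleaving.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical demo of fire-and-forget event publishing; not Ambari code. */
final class AsyncPublishDemo {

    /** Runs the scenario and returns the order in which the steps happened. */
    static List<String> run() {
        List<String> order = Collections.synchronizedList(new ArrayList<>());
        ExecutorService eventBus = Executors.newSingleThreadExecutor();
        CountDownLatch hostDeleted = new CountDownLatch(1);
        try {
            // "publish": hand the handler to another thread and return immediately.
            eventBus.submit(() -> {
                try {
                    hostDeleted.await(); // handler only proceeds once the host is gone
                    order.add("handler: clean up identities");
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            order.add("caller: publish returned");
            order.add("caller: host deleted");
            hostDeleted.countDown();
            eventBus.shutdown();
            eventBus.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for handler", e);
        } finally {
            eventBus.shutdownNow();
        }
        return new ArrayList<>(order);
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

In this sketch the handler always observes a state in which the host has already been removed; in the real server the relative order is not fixed, which is why the outcome differs from component to component.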
> A1-1-1=org.apache.ambari.server.controller.utilities.KerberosIdentityCleaner#componentRemoved
> call chain
> {code:java}
> org.apache.ambari.server.controller.utilities.RemovableIdentities#remove
> org.apache.ambari.server.controller.KerberosHelperImpl#deleteIdentities(org.apache.ambari.server.state.Cluster,
> java.util.List<org.apache.ambari.server.serveraction.kerberos.Component>,
> java.util.Set<java.lang.String>)
> org.apache.ambari.server.controller.KerberosHelperImpl#validateKDCCredentials(org.apache.ambari.server.controller.KerberosDetails,
> org.apache.ambari.server.state.Cluster) //check KDC administrator credentials
> A1-1-1-1=org.apache.ambari.server.controller.DeleteIdentityHandler#addDeleteIdentityStages
> // adds the stages that prepare and perform the identity deletion
> {code}
> A1-1-1-1=org.apache.ambari.server.controller.DeleteIdentityHandler#addDeleteIdentityStages
> call chain
> {code:java}
> if (manageIdentities) {
>   addPrepareDeleteIdentity(cluster, hostParamsJson, event, commandParameters,
>       stageContainer);
>   addDeleteKeytab(cluster, commandParameters.getAffectedHostNames(),
>       hostParamsJson, commandParameters, stageContainer);
>   addDestroyPrincipals(cluster, hostParamsJson, event, commandParameters,
>       stageContainer);
> }
> org.apache.ambari.server.controller.DeleteIdentityHandler#addDeleteKeytab
> // Checks whether the host exists to decide whether to create this stage.
> // For the component's Kerberos identities to be deleted, this stage should
> // not be created, i.e. the host-exists check should return false because A2
> // has already deleted the host from MySQL.
> org.apache.ambari.server.controller.DeleteIdentityHandler#addDestroyPrincipals
>
> org.apache.ambari.server.serveraction.kerberos.DestroyPrincipalsServerAction#execute
> // Deletes the component's Kerberos identities in both MySQL and the KDC. It uses
> //   kerberosKeytabPrincipalEntities = kerberosKeytabPrincipalDAO.findByFilters(filters);
> // to get the entities to delete. For the identities to be deleted, that list
> // must not be empty, i.e.
> // org.apache.ambari.server.orm.dao.KerberosKeytabPrincipalDAO#findByFilter
> // must not return an empty result.
> A-1-1-1-1-1=org.apache.ambari.server.orm.dao.KerberosKeytabPrincipalDAO#findByFilter
> {code}
> A-1-1-1-1-1=org.apache.ambari.server.orm.dao.KerberosKeytabPrincipalDAO#findByFilter
> call chain
> {code:java}
> for (String hostname : filter.getHostNames()) {
>   HostEntity host = hostDAO.findByName(hostname);
>   // If the host is not found, host == null and hasNull = true.
>   // If the filter names a single host that has been re-inserted, the lookup
>   // finds it, but its host id has no identities in the MySQL kkp tables.
>   // ...
> }
> Predicate hostIDPredicate = (hostIds.isEmpty()) ? null :
>     root.get("hostId").in(hostIds);
> Predicate hostNullIDPredicate = (hasNull) ? root.get("hostId").isNull() :
>     null;
> {code}
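The effect of a re-inserted host on this lookup can be shown with an in-memory model. The following is a minimal sketch, not the Ambari DAO: it assumes rows are matched by host id, that a re-inserted host receives a different id, and that the two predicates are combined with OR. The class, its fields, and the sample data are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory model of filtering by host id; not the Ambari DAO. */
final class KkpFilterSketch {

    /** One keytab/principal mapping row; hostId may be null. */
    static final class Row {
        final Long hostId;
        final String principal;

        Row(Long hostId, String principal) {
            this.hostId = hostId;
            this.principal = principal;
        }
    }

    /** Stand-in for the hosts table: host name to host id. */
    final Map<String, Long> hosts = new HashMap<>();
    /** Stand-in for the keytab/principal mapping table. */
    final List<Row> rows = new ArrayList<>();

    /** Resolves names to ids, remembers whether any name was unknown, then filters rows. */
    List<Row> findByHostNames(List<String> hostNames) {
        List<Long> hostIds = new ArrayList<>();
        boolean hasNull = false;
        for (String name : hostNames) {
            Long id = hosts.get(name);
            if (id != null) {
                hostIds.add(id);
            } else {
                hasNull = true;
            }
        }
        List<Row> result = new ArrayList<>();
        for (Row row : rows) {
            boolean idMatch = row.hostId != null && hostIds.contains(row.hostId);
            boolean nullMatch = hasNull && row.hostId == null;
            if (idMatch || nullMatch) { // assumption: predicates are OR-ed
                result.add(row);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        KkpFilterSketch db = new KkpFilterSketch();
        db.hosts.put("worker7", 7L);
        db.rows.add(new Row(7L, "dn/worker7@EXAMPLE.COM"));
        List<String> filter = Collections.singletonList("worker7");
        System.out.println("before delete: " + db.findByHostNames(filter).size());

        // Host row removed, then re-inserted by a heartbeat with a different id.
        db.hosts.remove("worker7");
        db.hosts.put("worker7", 42L);
        System.out.println("after re-insert: " + db.findByHostNames(filter).size());
    }
}
```

With a single re-inserted host in the filter, the name resolves to an id that no row carries, so the result is empty and nothing is handed on for deletion.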
> A2=org.apache.ambari.server.state.cluster.ClustersImpl#deleteHost call chain
> {code:java}
> org.apache.ambari.server.state.cluster.ClustersImpl#deleteHostEntityRelationships
> org.apache.ambari.server.state.cluster.ClustersImpl#unmapHostFromClusters
> org.apache.ambari.server.state.cluster.ClustersImpl#unmapHostClusterEntities
> //delete host cluster mapping
> org.apache.ambari.server.orm.dao.KerberosKeytabPrincipalDAO#removeByHost
> hostDAO.remove(entity);
> // Note: if the host is still heartbeating, new records will be re-inserted
> // into the hosts and hoststate tables.
> {code}
> There are four reasons why some services' Kerberos identities are not deleted:
> * First, kdc.admin.credential is missing, possibly because ambari-server was
> restarted.
> Solution: make sure kdc.admin.credential exists when deleting the host; if it
> does not, add it with a POST request.
> * Second, A1-1-1-1 executes before A2. addDeleteKeytab checks whether the host
> exists; because A2 has not run yet, the host still exists and the stage is
> added. When this stage executes it fails, so the
> ServiceComponentUninstalledEvent fails and the component in the event leaves
> its Kerberos identity in MySQL and the KDC.
> Solution: repeat the check in addDeleteKeytab and wait for A2 to finish. In
> most cases A2 finishes before A1-1-1-1 anyway, so the wait should be no more
> than 1 or 2 seconds.
> * Third, A2 has executed, but the host is still heartbeating and is
> re-inserted into the hosts table. When A1-1-1-1 then executes, the host-exists
> check passes, the addDeleteKeytab stage is added, and it fails.
> Solution: in addDeleteKeytab, in addition to checking that the host exists,
> check that the host is mapped to a cluster. A re-inserted host has no cluster
> mapping, so this check excludes it.
> * Fourth, A1-1-1-1 filters kerberosKeytabPrincipalEntities (kkpes) through
> A-1-1-1-1-1, which finds the re-inserted host, so kkpes is empty and this
> ServiceComponentUninstalledEvent leaves the component's Kerberos identities in
> MySQL and the KDC.
> Solution: in A-1-1-1-1-1 (findByFilter), when the filter contains only one
> host, check both that the host exists and that it is in a cluster, to exclude
> a re-inserted host. (If the filter contains more than one host, no error
> occurs.)
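The extra check proposed for the third and fourth cases can be sketched as follows. This is a minimal in-memory illustration, assuming a re-inserted host shows up as a host row with no cluster mapping; HostMembershipCheck and its methods are hypothetical and do not correspond to existing Ambari APIs.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of an "exists and is mapped to a cluster" check. */
final class HostMembershipCheck {

    /** Stand-in for the hosts table plus the host-to-cluster mapping. */
    private final Map<String, Set<String>> clustersByHost = new HashMap<>();

    /** Registers a host; an empty cluster set models a host re-inserted by a heartbeat. */
    void putHost(String hostName, Set<String> clusters) {
        clustersByHost.put(hostName, new HashSet<>(clusters));
    }

    /** Removes the host row together with its cluster mapping. */
    void removeHost(String hostName) {
        clustersByHost.remove(hostName);
    }

    /** Plain existence check: is there a host row at all? */
    boolean hostExists(String hostName) {
        return clustersByHost.containsKey(hostName);
    }

    /** Stricter check: the host row exists and is mapped to at least one cluster. */
    boolean isClusterMember(String hostName) {
        Set<String> clusters = clustersByHost.get(hostName);
        return clusters != null && !clusters.isEmpty();
    }

    public static void main(String[] args) {
        HostMembershipCheck hosts = new HostMembershipCheck();
        hosts.putHost("worker7", Collections.singleton("c1"));
        hosts.removeHost("worker7");                               // host deleted
        hosts.putHost("worker7", Collections.<String>emptySet());  // re-inserted, unmapped
        System.out.println("hostExists: " + hosts.hostExists("worker7"));
        System.out.println("isClusterMember: " + hosts.isClusterMember("worker7"));
    }
}
```

The plain existence check accepts the re-inserted host, while the stricter check rejects it, which is the distinction the proposed fix relies on.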
--
This message was sent by Atlassian Jira
(v8.3.4#803005)