[
https://issues.apache.org/jira/browse/AMBARI-25672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
h.s updated AMBARI-25672:
-------------------------
Description:
step 1:
# delete a host from a Kerberos cluster (not a master host)
# stop all services on the host
# delete the host through the API
step 2:
# prepare a host and install the agent
# add the node to the cluster through the API and install services
# regenerate_keytab
# Ambari hangs at preparing operations/hostname/preparing operations
This happens because step 1.3 does not completely clear the host's Kerberos
identities in either the database (MySQL) or the KDC (kdc.admin).
* in MySQL
there are 4 tables: kkp_mapping_service, kerberos_keytab_principal,
kerberos_keytab and kerberos_principal. The host-related Kerberos identities in
these tables must be deleted completely.
* in the KDC,
{code:java}
kadmin.local
listprincs *hostname*{code}
still lists identities related to the host that were not deleted.
The Kerberos identities of some services are deleted from MySQL and the KDC,
but those of other services are not.
If the Kerberos identities of any service are left behind, the next attempt to
add a host to this cluster hangs at preparing operations.
Delete-host API call chain in ambari-server:
{code:java}
org.apache.ambari.server.api.services.HostService#deleteHost
org.apache.ambari.server.api.services.BaseService#handleRequest
org.apache.ambari.server.api.services.BaseRequest#process
org.apache.ambari.server.api.handlers.BaseManagementHandler#handleRequest
org.apache.ambari.server.api.handlers.DeleteHandler#persist
org.apache.ambari.server.api.services.persistence.PersistenceManagerImpl#delete
org.apache.ambari.server.controller.internal.ClusterControllerImpl#deleteResources
org.apache.ambari.server.controller.internal.AbstractAuthorizedResourceProvider#deleteResources
org.apache.ambari.server.controller.internal.HostResourceProvider#deleteResourcesAuthorized
org.apache.ambari.server.controller.internal.HostResourceProvider#deleteHosts
A=org.apache.ambari.server.controller.internal.HostResourceProvider#processDeleteHostRequests
{code}
A=org.apache.ambari.server.controller.internal.HostResourceProvider#processDeleteHostRequests
has these main steps
{code:java}
// A1: deletes the host's components and their Kerberos identities
A1=org.apache.ambari.server.controller.AmbariManagementControllerImpl#deleteHostComponents
// A2: deletes the host from MySQL
A2=org.apache.ambari.server.state.cluster.ClustersImpl#deleteHost
{code}
A1=org.apache.ambari.server.controller.AmbariManagementControllerImpl#deleteHostComponents
call chain
{code:java}
org.apache.ambari.server.state.ServiceComponentImpl#deleteServiceComponentHosts
A1-1=org.apache.ambari.server.state.svccomphost.ServiceComponentHostImpl#delete
{code}
A1-1=org.apache.ambari.server.state.svccomphost.ServiceComponentHostImpl#delete
call chain
{code:java}
org.apache.ambari.server.state.cluster.ClusterImpl#removeServiceComponentHost
A1-1-1=eventPublisher.publish(event);
// Publishes a ServiceComponentUninstalledEvent.
// org.apache.ambari.server.controller.utilities.KerberosIdentityCleaner#componentRemoved
// handles this event and deletes the component's Kerberos identities.
// Once the event is published, the next line of code executes immediately;
// it does not wait for the event handling to finish.
{code}
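The ordering problem described in the comment above can be illustrated with a small self-contained sketch. This is not Ambari code: AsyncPublishSketch, handlerSawHost and the single-thread executor standing in for the event publisher are hypothetical names chosen for illustration. It only shows that an asynchronously handled event may observe the host either before or after it has been deleted.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical model, not Ambari code: an executor stands in for the event publisher.
final class AsyncPublishSketch {

    /**
     * Publishes one "component removed" event asynchronously and reports
     * whether the handler still saw the host when it ran.
     */
    static boolean handlerSawHost(boolean hostDeletedBeforeHandlerRuns) {
        AtomicBoolean hostExists = new AtomicBoolean(true);
        AtomicBoolean sawHost = new AtomicBoolean(false);
        CountDownLatch handlerMayRun = new CountDownLatch(1);
        CountDownLatch handlerDone = new CountDownLatch(1);
        ExecutorService eventBus = Executors.newSingleThreadExecutor();
        try {
            // "publish": the handler is queued on another thread, the caller continues.
            eventBus.execute(() -> {
                try {
                    handlerMayRun.await();
                    sawHost.set(hostExists.get()); // stand-in for the host-exists check
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    handlerDone.countDown();
                }
            });
            if (hostDeletedBeforeHandlerRuns) {
                hostExists.set(false); // stand-in for A2 finishing first
            }
            handlerMayRun.countDown();
            handlerDone.await();
            return sawHost.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            eventBus.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("A2 first, handler saw host: " + handlerSawHost(true));
        System.out.println("handler first, handler saw host: " + handlerSawHost(false));
    }
}
```

The latches only make the two orderings reproducible in the sketch; in a real server the order would depend on thread scheduling.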
A1-1-1=org.apache.ambari.server.controller.utilities.KerberosIdentityCleaner#componentRemoved
call chain
{code:java}
org.apache.ambari.server.controller.utilities.RemovableIdentities#remove
org.apache.ambari.server.controller.KerberosHelperImpl#deleteIdentities(org.apache.ambari.server.state.Cluster, java.util.List<org.apache.ambari.server.serveraction.kerberos.Component>, java.util.Set<java.lang.String>)
// checks the KDC administrator credentials
org.apache.ambari.server.controller.KerberosHelperImpl#validateKDCCredentials(org.apache.ambari.server.controller.KerberosDetails, org.apache.ambari.server.state.Cluster)
// adds the stages used to delete the identities
A1-1-1-1=org.apache.ambari.server.controller.DeleteIdentityHandler#addDeleteIdentityStages
{code}
A1-1-1-1=org.apache.ambari.server.controller.DeleteIdentityHandler#addDeleteIdentityStages
call chain
{code:java}
if (manageIdentities) {
  addPrepareDeleteIdentity(cluster, hostParamsJson, event, commandParameters, stageContainer);
  addDeleteKeytab(cluster, commandParameters.getAffectedHostNames(), hostParamsJson, commandParameters, stageContainer);
  addDestroyPrincipals(cluster, hostParamsJson, event, commandParameters, stageContainer);
}

org.apache.ambari.server.controller.DeleteIdentityHandler#addDeleteKeytab
// Checks whether the host exists to decide whether to create this stage.
// For the component's Kerberos identities to be deleted, this stage should not
// be created; that is, the host-exists check should return false because A2
// has already deleted the host from MySQL.

org.apache.ambari.server.controller.DeleteIdentityHandler#addDestroyPrincipals
org.apache.ambari.server.serveraction.kerberos.DestroyPrincipalsServerAction#execute
// Deletes the component's Kerberos identities from both MySQL and the KDC. It uses
//   kerberosKeytabPrincipalEntities = kerberosKeytabPrincipalDAO.findByFilters(filters);
// to get the entities to delete. For the identities to be deleted,
// kerberosKeytabPrincipalEntities must not be empty; that is,
// org.apache.ambari.server.orm.dao.KerberosKeytabPrincipalDAO#findByFilter
// must not return an empty result.
A-1-1-1-1-1=org.apache.ambari.server.orm.dao.KerberosKeytabPrincipalDAO#findByFilte
{code}
A-1-1-1-1-1=org.apache.ambari.server.orm.dao.KerberosKeytabPrincipalDAO#findByFilte
call chain
{code:java}
for (String hostname : filter.getHostNames()) {
  HostEntity host = hostDAO.findByName(hostname);
  // If the host is not found, host == null and hasNull = true.
  // If there is only one host and it has been re-inserted, the host is found,
  // but its host id has no identities in the MySQL kkp tables.
  // ...
}
Predicate hostIDPredicate = (hostIds.isEmpty()) ? null : root.get("hostId").in(hostIds);
Predicate hostNullIDPredicate = (hasNull) ? root.get("hostId").isNull() : null;
{code}
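The effect of a re-inserted host on this lookup can be sketched with an in-memory model. This is not the Ambari DAO: FindByFilterSketch, KkpRow and findByHostNames are hypothetical names, the maps and lists stand in for the MySQL tables, and the sketch assumes that the leftover kkp rows of a deleted host carry a null host id, which is how the hasNull predicate above is read here.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model of the lookup; not the Ambari DAO.
final class FindByFilterSketch {

    /** One kkp row; hostId is null when the row no longer points at a host (assumption). */
    static final class KkpRow {
        final Long hostId;
        final String principal;

        KkpRow(Long hostId, String principal) {
            this.hostId = hostId;
            this.principal = principal;
        }
    }

    static List<KkpRow> findByHostNames(Map<String, Long> hostsTable,
                                        List<KkpRow> kkpRows,
                                        Collection<String> hostNames) {
        Set<Long> hostIds = new HashSet<>();
        boolean hasNull = false;
        for (String hostname : hostNames) {
            Long id = hostsTable.get(hostname); // stand-in for hostDAO.findByName
            if (id == null) {
                hasNull = true;
            } else {
                hostIds.add(id);
            }
        }
        List<KkpRow> matches = new ArrayList<>();
        for (KkpRow row : kkpRows) {
            boolean idMatch = row.hostId != null && hostIds.contains(row.hostId);
            boolean nullMatch = hasNull && row.hostId == null;
            if (idMatch || nullMatch) {
                matches.add(row);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<KkpRow> rows = Collections.singletonList(new KkpRow(null, "dn/host1@EXAMPLE.COM"));
        Map<String, Long> hosts = new HashMap<>();
        // Host deleted and not re-inserted: the null-id predicate finds the leftover row.
        System.out.println(findByHostNames(hosts, rows, Collections.singletonList("host1")).size());
        // Host re-inserted by a heartbeat with a new id: nothing matches, nothing is deleted.
        hosts.put("host1", 42L);
        System.out.println(findByHostNames(hosts, rows, Collections.singletonList("host1")).size());
    }
}
```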
A2=org.apache.ambari.server.state.cluster.ClustersImpl#deleteHost call chain
{code:java}
org.apache.ambari.server.state.cluster.ClustersImpl#deleteHostEntityRelationships
org.apache.ambari.server.state.cluster.ClustersImpl#unmapHostFromClusters
// deletes the host-cluster mapping
org.apache.ambari.server.state.cluster.ClustersImpl#unmapHostClusterEntities
org.apache.ambari.server.orm.dao.KerberosKeytabPrincipalDAO#removeByHost
// Note: if the host is still heartbeating, new records will be re-inserted
// into the hosts and hoststate tables
hostDAO.remove(entity);
{code}
There are 4 reasons why the Kerberos identities of some services cannot be deleted:
* First, kdc.admin.credential is lost, possibly because of an ambari-server restart.
solve: make sure kdc.admin.credential exists when deleting the host; if it does
not, use a POST request to add it.
* Second, A1-1-1-1 executes before A2. addDeleteKeytab checks whether the host
exists (A2 has not executed yet, so it does) and adds the stage. When the stage
executes it is certain to fail, so this ServiceComponentUninstalledEvent fails and
the component in the event leaves its Kerberos identity behind in MySQL and the KDC.
solve: repeat the check in addDeleteKeytab and wait for A2 to finish. In most
cases A2 finishes before A1-1-1-1, so the wait is no more than 1 or 2 seconds.
* Third, A2 executes, but the host is still heartbeating and is re-inserted into
hosts. A1-1-1-1 then executes, enters the addDeleteKeytab stage, and fails.
solve: in addDeleteKeytab, check that the host exists and also that it belongs
to a cluster, to make sure it is not a re-inserted host; a re-inserted host has
no cluster mapping.
* Fourth, A1-1-1-1 filters kerberosKeytabPrincipalEntities (kkpes) with
A-1-1-1-1-1, which finds a re-inserted host, so kkpes is empty. This
ServiceComponentUninstalledEvent leaves the component's Kerberos identities
behind in MySQL and the KDC.
solve: in A-1-1-1-1-1, check that the host exists and that it is in a cluster,
to exclude a re-inserted host when there is only one host in the findByFilter
method (when the method is used with more than one host, there is no error).
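The checks proposed in the second, third and fourth items can be sketched as follows. This is only an illustration under stated assumptions, not a patch: HostGuardSketch, isClusterHost and waitUntilHostGone are hypothetical names, the map stands in for the host-cluster mapping, and the attempt count and sleep interval are arbitrary.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.BooleanSupplier;

// Hypothetical helpers illustrating the proposed checks; not Ambari code.
final class HostGuardSketch {

    /** True only if the host is known and still mapped to at least one cluster. */
    static boolean isClusterHost(Map<String, Set<String>> hostToClusters, String hostName) {
        Set<String> clusters = hostToClusters.get(hostName);
        // A re-inserted host exists but has no cluster mapping, so it is rejected here.
        return clusters != null && !clusters.isEmpty();
    }

    /** Re-checks up to maxAttempts times; returns true once the host is reported gone. */
    static boolean waitUntilHostGone(BooleanSupplier hostStillInCluster, int maxAttempts, long sleepMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (!hostStillInCluster.getAsBoolean()) {
                return true;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> mapping = new HashMap<>();
        mapping.put("member-host", Collections.singleton("cluster1"));
        mapping.put("reinserted-host", Collections.<String>emptySet());
        System.out.println(isClusterHost(mapping, "member-host"));
        System.out.println(isClusterHost(mapping, "reinserted-host"));
        System.out.println(waitUntilHostGone(() -> isClusterHost(mapping, "reinserted-host"), 3, 10L));
    }
}
```

A bounded retry is used here so that a host that never disappears cannot block the caller indefinitely; whether polling or an explicit ordering between A1 and A2 is the better fix is left open.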
> delete host from a kerberos cluster not completely clear all identies in
> database and kdc
> -----------------------------------------------------------------------------------------
>
> Key: AMBARI-25672
> URL: https://issues.apache.org/jira/browse/AMBARI-25672
> Project: Ambari
> Issue Type: Bug
> Components: ambari-server
> Affects Versions: 2.7.3
> Reporter: h.s
> Priority: Major
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)