[ 
https://issues.apache.org/jira/browse/AMBARI-22235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203686#comment-16203686
 ] 

slim bouguerra commented on AMBARI-22235:
-----------------------------------------

As per [~rlevas]'s explanation, the issue is related to the default value of 
the Druid principal:
{code}
"value": "${druid-env/druid_user}@${realm}",
{code}
Though technically this is correct, by not adding some potentially unique value 
to the principal name we run the risk of invalidating the Druid keytab file if 
multiple clusters are configured to use the same KDC. This is because Ambari 
changes the password for a principal when it creates the relevant keytab file, 
so a keytab that another cluster generated earlier for the same principal 
stops working.

All headless (or user) Kerberos identities should include something like the 
cluster name in their principal names. Ambari will do this for you if you add 
the ${principal_suffix} variable. For example:
{code}
"value": "${druid-env/druid_user}${principal_suffix}@${realm}",
{code}
By default, this is set to a dash followed by the cluster's name. This may 
result in the following principal name:
{code}
[email protected]
{code}
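
To make the collision concrete, here is a rough sketch of how the two 
templates expand for two clusters sharing one KDC. This is not Ambari's 
actual substitution code; the expand() and principal() helpers, the cluster 
names c1 and c2, the user name druid and the realm EXAMPLE.COM are all made up 
for illustration. The suffix is assumed to be a dash followed by the cluster 
name, as described above.

```python
import re


def expand(template, values):
    """Replace each ${name} placeholder in template with values[name]."""
    return re.sub(r"\$\{([^}]+)\}", lambda m: values[m.group(1)], template)


def principal(template, cluster_name, druid_user="druid", realm="EXAMPLE.COM"):
    """Expand a principal template for one (hypothetical) cluster."""
    values = {
        "druid-env/druid_user": druid_user,
        "realm": realm,
        # Assumed default: a dash followed by the cluster's name.
        "principal_suffix": "-" + cluster_name,
    }
    return expand(template, values)


OLD = "${druid-env/druid_user}@${realm}"
NEW = "${druid-env/druid_user}${principal_suffix}@${realm}"

# Two hypothetical clusters that share the same KDC and realm.
clusters = ("c1", "c2")
old_names = {principal(OLD, c) for c in clusters}
new_names = {principal(NEW, c) for c in clusters}

print(sorted(old_names))  # one shared principal: the keytabs collide
print(sorted(new_names))  # one principal per cluster: no collision
```

With the old template both clusters resolve to the same principal, so 
whichever cluster creates its keytab last invalidates the other's. With the 
suffix each cluster gets its own principal.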


> Druid service check failed during EU
> ------------------------------------
>
>                 Key: AMBARI-22235
>                 URL: https://issues.apache.org/jira/browse/AMBARI-22235
>             Project: Ambari
>          Issue Type: Bug
>            Reporter: slim bouguerra
>
> Observed this issue on two clusters
> Druid service check failed during EU 
> {code}
> Traceback (most recent call last):
>   File 
> "/var/lib/ambari-agent/cache/common-services/DRUID/0.10.1/package/scripts/service_check.py",
>  line 44, in <module>
>     ServiceCheck().execute()
>   File 
> "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py",
>  line 367, in execute
>     method(env)
>   File 
> "/var/lib/ambari-agent/cache/common-services/DRUID/0.10.1/package/scripts/service_check.py",
>  line 30, in service_check
>     self.checkComponent(params, "druid_coordinator", "druid-coordinator")
>   File 
> "/var/lib/ambari-agent/cache/common-services/DRUID/0.10.1/package/scripts/service_check.py",
>  line 40, in checkComponent
>     logoutput=True)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", 
> line 166, in __init__
>     self.env.run()
>   File 
> "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", 
> line 160, in run
>     self.run_action(resource, action)
>   File 
> "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", 
> line 124, in run_action
>     provider_action()
>   File 
> "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py",
>  line 262, in action_run
>     tries=self.resource.tries, try_sleep=self.resource.try_sleep)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", 
> line 72, in inner
>     result = function(command, **kwargs)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", 
> line 102, in checked_call
>     tries=tries, try_sleep=try_sleep, 
> timeout_kill_strategy=timeout_kill_strategy)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", 
> line 150, in _call_wrapper
>     result = _call(command, **kwargs_copy)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", 
> line 303, in _call
>     raise ExecutionFailed(err_msg, code, out, err)
> resource_management.core.exceptions.ExecutionFailed: Execution of 'curl -s -o 
> /dev/null -w'%{http_code}' --negotiate -u: -k 
> ctr-e134-1499953498516-217002-01-000010.hwx.site:8081/status | grep 200' 
> returned 1.
> {code}
> Here is the failure stack trace from druid logs 
> {code} 
>  
> Caused by: sun.security.krb5.Asn1Exception: Identifier doesn't match expected 
> value (906)
>         at sun.security.krb5.internal.KDCRep.init(KDCRep.java:140) 
> ~[?:1.7.0_95]
>         at sun.security.krb5.internal.ASRep.init(ASRep.java:64) ~[?:1.7.0_95]
>         at sun.security.krb5.internal.ASRep.<init>(ASRep.java:59) 
> ~[?:1.7.0_95]
>         at sun.security.krb5.KrbAsRep.<init>(KrbAsRep.java:60) ~[?:1.7.0_95]
>         at sun.security.krb5.KrbAsReqBuilder.send(KrbAsReqBuilder.java:316) 
> ~[?:1.7.0_95]
>         at sun.security.krb5.KrbAsReqBuilder.action(KrbAsReqBuilder.java:361) 
> ~[?:1.7.0_95]
>         at 
> com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:735)
>  ~[?:1.7.0_95]
>         at 
> com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:584) 
> ~[?:1.7.0_95]
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.7.0_95]
>         at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
> ~[?:1.7.0_95]
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.7.0_95]
>         at java.lang.reflect.Method.invoke(Method.java:606) ~[?:1.7.0_95]
>         at 
> javax.security.auth.login.LoginContext.invoke(LoginContext.java:762) 
> ~[?:1.7.0_95]
>         at 
> javax.security.auth.login.LoginContext.access$000(LoginContext.java:203) 
> ~[?:1.7.0_95]
>         at 
> javax.security.auth.login.LoginContext$4.run(LoginContext.java:690) 
> ~[?:1.7.0_95]
>         at 
> javax.security.auth.login.LoginContext$4.run(LoginContext.java:688) 
> ~[?:1.7.0_95]
>         at java.security.AccessController.doPrivileged(Native Method) 
> ~[?:1.7.0_95]
>         at 
> javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:687) 
> ~[?:1.7.0_95]
>         at 
> javax.security.auth.login.LoginContext.login(LoginContext.java:595) 
> ~[?:1.7.0_95]
>         at 
> org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytab(UserGroupInformation.java:1089)
>  ~[?:?]
>         at 
> io.druid.storage.hdfs.HdfsStorageAuthentication.authenticate(HdfsStorageAuthentication.java:65)
>  ~[?:?]
>         ... 10 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
