[ 
https://issues.apache.org/jira/browse/METRON-1266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16211617#comment-16211617
 ] 

Nick Allen edited comment on METRON-1266 at 10/19/17 7:46 PM:
--------------------------------------------------------------

I have found that the following property needs to be set on the Profiler 
topology when running in a Kerberized environment.  This is similar to how the 
other topologies, like Enrichment, are already configured.
{code}
    topology.auto-credentials: ['org.apache.storm.security.auth.kerberos.AutoTGT']
{code}
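
A minimal sketch of applying the same setting programmatically, assuming the
topology configuration is handled as a plain map before submission. The class
and method names here are hypothetical; only the property key and the AutoTGT
class name come from the setting above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ProfilerAutoCredentials {

    // Property key and plugin class name, taken from the setting above.
    static final String AUTO_CREDENTIALS_KEY = "topology.auto-credentials";
    static final String AUTO_TGT = "org.apache.storm.security.auth.kerberos.AutoTGT";

    /**
     * Returns a copy of the given topology config with AutoTGT present in
     * the auto-credentials list. Existing entries are preserved.
     */
    static Map<String, Object> withAutoTgt(Map<String, Object> conf) {
        Map<String, Object> copy = new HashMap<>(conf);
        List<String> plugins = new ArrayList<>();

        // Carry over any plugins that were already configured.
        Object existing = copy.get(AUTO_CREDENTIALS_KEY);
        if (existing instanceof List) {
            for (Object plugin : (List<?>) existing) {
                plugins.add(String.valueOf(plugin));
            }
        }

        // Add AutoTGT only if it is not already listed.
        if (!plugins.contains(AUTO_TGT)) {
            plugins.add(AUTO_TGT);
        }
        copy.put(AUTO_CREDENTIALS_KEY, plugins);
        return copy;
    }

    public static void main(String[] args) {
        Map<String, Object> conf = withAutoTgt(new HashMap<>());
        System.out.println(conf.get(AUTO_CREDENTIALS_KEY));
    }
}
```

Copying the map and preserving existing entries keeps the helper safe to call
on a configuration that already lists other credential plugins.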

After finding this, the big mystery for me was why the missing setting did not 
cause this issue in all Kerberized environments, all the time.  Surely this 
omission should break the Profiler when running in any Kerberized environment 
and so should have been caught sooner.

I think I understand why now.  The failure occurs because no ticket can be 
found for authentication when the Profiler attempts to flush profile 
measurements to HBase.  Without this setting, the Profiler topology itself is 
not able to obtain Kerberos tickets for authentication.  However, if the ticket 
cache on the worker node is already populated with a valid ticket, then this 
issue will not occur.  The ticket cache can be populated if another process 
generates a ticket or a user manually runs kinit on the same node.
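
One way to test this theory on a given worker node is to check whether the 
default ticket cache currently holds a usable ticket.  The following is a rough 
diagnostic sketch, not part of the Profiler; the class name is hypothetical, and 
it assumes it is run as the same OS user as the Storm worker, with the JDK's 
standard Krb5LoginModule available.

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.Subject;
import javax.security.auth.login.AppConfigurationEntry;
import javax.security.auth.login.Configuration;
import javax.security.auth.login.LoginContext;
import javax.security.auth.login.LoginException;

public class TicketCacheCheck {

    /**
     * Attempts a JAAS login that may only use the existing ticket cache.
     * Returns true if a ticket was found there, false otherwise.
     */
    static boolean hasCachedTicket() {
        final Map<String, String> options = new HashMap<>();
        options.put("useTicketCache", "true"); // read the default ticket cache
        options.put("doNotPrompt", "true");    // never fall back to a password prompt

        // In-memory JAAS configuration, so no external jaas.conf is required.
        Configuration jaas = new Configuration() {
            @Override
            public AppConfigurationEntry[] getAppConfigurationEntry(String name) {
                return new AppConfigurationEntry[] {
                    new AppConfigurationEntry(
                        "com.sun.security.auth.module.Krb5LoginModule",
                        AppConfigurationEntry.LoginModuleControlFlag.REQUIRED,
                        options)
                };
            }
        };

        try {
            LoginContext context =
                new LoginContext("ticket-cache-check", new Subject(), null, jaas);
            context.login();
            context.logout();
            return true;
        } catch (LoginException e) {
            // No usable ticket in the cache (or Kerberos is not configured).
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("Ticket cache populated: " + hasCachedTicket());
    }
}
```

If this reports a populated cache on nodes where the Profiler works and an 
empty one on nodes where it fails, that would be consistent with the 
explanation above.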

This explains why the problem occurs sporadically and only in some 
environments.  The issue is less likely to occur in an environment like Full 
Dev, which has fewer, more active nodes.  There, it is likely that some other 
process or user has already populated the ticket cache.  In a larger, 
multi-node cluster, the ticket cache on any given worker node is less likely to 
be populated.


was (Author: nickwallen):
I have found that the following property needs to be set on the Profiler 
topology when running in a Kerberized environment.  This is similar to how the 
other topologies, like Enrichment, are already configured.
{code}
    topology.auto-credentials: ['org.apache.storm.security.auth.kerberos.AutoTGT']
{code}

After finding this, the big mystery for me was why the configuration miss did 
not cause this issue in all kerberized environments, all the time.  I think I 
understand why now.  The problem is obviously that a ticket cannot be found to 
authenticate when attempting to flush profile measurements to HBase. Due to 
this configuration miss, the Profiler topology itself is not able to generate 
Kerberos tickets for authentication.  At the same time, if the ticket cache on 
the worker node is already populated with a valid ticket, then this issue will 
not occur.  The ticket cache can be populated if another process generates a 
ticket or a user manually kinits on the same node.

This explains why the problem occurs sporadically and only in some 
environments.  This issue is less likely to occur in an environment, like Full 
Dev, where there are fewer, more active nodes.  In this case, it is likely that 
some other process or user already pre-populated the ticket cache.  In a 
larger, multi-node cluster, the ticket cache is less likely to be populated.

> Profiler - No valid credentials provided
> ----------------------------------------
>
>                 Key: METRON-1266
>                 URL: https://issues.apache.org/jira/browse/METRON-1266
>             Project: Metron
>          Issue Type: Bug
>    Affects Versions: 0.4.1
>            Reporter: Nick Allen
>            Assignee: Nick Allen
>             Fix For: Next + 1
>
>
> When running the Profiler on a cluster that has multiple nodes and is secured 
> by Kerberos, it was observed that the HBaseBolt was unable to write to HBase. 
>  The Storm worker running the HBaseBolt logged the following exception.  This 
> does not occur all the time and does not occur in all environments.
> {code}
> 2017-10-19 14:51:00.146 o.a.h.h.i.AbstractRpcClient [ERROR] SASL authentication failed. The most likely cause is missing or invalid credentials. Consider 'kinit'.
> javax.security.sasl.SaslException: GSS initiate failed
>       at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211) ~[?:1.8.0_144]
>       at org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:179) ~[stormjar.jar:?]
>       at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupSaslConnection(RpcClientImpl.java:609) ~[stormjar.jar:?]
>       at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.access$600(RpcClientImpl.java:154) [stormjar.jar:?]
>       at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:735) ~[stormjar.jar:?]
>       at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:732) ~[stormjar.jar:?]
>       at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_144]
>       at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_144]
>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) ~[stormjar.jar:?]
>       at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:732) [stormjar.jar:?]
>       at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:885) [stormjar.jar:?]
>       at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:854) [stormjar.jar:?]
>       at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1180) [stormjar.jar:?]
>       at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:213) [stormjar.jar:?]
>       at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:287) [stormjar.jar:?]
>       at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:32651) [stormjar.jar:?]
>       at org.apache.hadoop.hbase.client.ClientSmallScanner$SmallScannerCallable.call(ClientSmallScanner.java:201) [stormjar.jar:?]
>       at org.apache.hadoop.hbase.client.ClientSmallScanner$SmallScannerCallable.call(ClientSmallScanner.java:180) [stormjar.jar:?]
>       at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200) [stormjar.jar:?]
>       at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:369) [stormjar.jar:?]
>       at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:343) [stormjar.jar:?]
>       at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:126) [stormjar.jar:?]
>       at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:64) [stormjar.jar:?]
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_144]
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_144]
>       at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
> Caused by: org.ietf.jgss.GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
>       at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147) ~[?:1.8.0_144]
>       at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122) ~[?:1.8.0_144]
>       at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187) ~[?:1.8.0_144]
>       at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224) ~[?:1.8.0_144]
>       at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212) ~[?:1.8.0_144]
>       at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179) ~[?:1.8.0_144]
>       at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192) ~[?:1.8.0_144]
>       ... 25 more
> {code}



