[ https://issues.apache.org/jira/browse/FLINK-2977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15000325#comment-15000325 ]

ASF GitHub Bot commented on FLINK-2977:
---------------------------------------

Github user StephanEwen commented on the pull request:

    https://github.com/apache/flink/pull/1342#issuecomment-155769960
  
    Robert has a good point. Right now, if someone uses HBase, the HBase 
dependency is part of the user program JAR. It would be nice to keep HBase out 
of the core JAR - it would be yet another fat dependency with multiple 
transitive dependencies.
    
    Once you change the code to reflectively load the HBase classes, one gets 
the Kerberos/HBase support as soon as one drops the HBase JARs into Flink's lib 
folder, or adds them to the Hadoop classpath env variable. At the same time, 
non-HBase users retain a thinner set of dependencies (and fewer potential 
conflicts).
    
    @nielsbasjes What do you think about that approach?
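The reflective approach suggested above could be sketched roughly as follows. This is a hedged illustration only; the class name `HBaseClasspathCheck` is hypothetical and not an existing Flink class:

```java
// Hypothetical sketch: detect HBase at runtime instead of compiling against it,
// so the core JAR carries no HBase dependency. Illustrative only.
public class HBaseClasspathCheck {

    /** Returns true only when the HBase client classes are on the classpath. */
    static boolean isHBaseAvailable() {
        try {
            Class.forName("org.apache.hadoop.hbase.HBaseConfiguration");
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("HBase on classpath: " + isHBaseAvailable());
    }
}
```

With such a check, the Kerberos/HBase token code would only run when the user has actually added the HBase JARs to Flink's lib folder or the Hadoop classpath.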


> Cannot access HBase in a Kerberos secured Yarn cluster
> ------------------------------------------------------
>
>                 Key: FLINK-2977
>                 URL: https://issues.apache.org/jira/browse/FLINK-2977
>             Project: Flink
>          Issue Type: Bug
>          Components: YARN Client
>            Reporter: Niels Basjes
>            Assignee: Niels Basjes
>         Attachments: FLINK-2977-20151005-untested.patch, 
> FLINK-2977-20151009.patch
>
>
> I have created a very simple Flink topology consisting of a streaming Source 
> (that outputs the timestamp a few times per second) and a Sink (that puts that 
> timestamp into a single record in HBase).
> Running this on a non-secure Yarn cluster works fine.
> To run it on a secured Yarn cluster my main routine now looks like this:
> {code}
> public static void main(String[] args) throws Exception {
>     System.setProperty("java.security.krb5.conf", "/etc/krb5.conf");
>     UserGroupInformation.loginUserFromKeytab("[email protected]", 
> "/home/nbasjes/.krb/nbasjes.keytab");
>     final StreamExecutionEnvironment env = 
> StreamExecutionEnvironment.getExecutionEnvironment();
>     env.setParallelism(1);
>     DataStream<String> stream = env.addSource(new TimerTicksSource());
>     stream.addSink(new SetHBaseRowSink());
>     env.execute("Long running Flink application");
> }
> {code}
> When I run this:
> {code}
> flink run -m yarn-cluster -yn 1 -yjm 1024 -ytm 4096 ./kerberos-1.0-SNAPSHOT.jar
> {code}
> I see after the startup messages:
> {quote}
> 17:13:24,466 INFO  org.apache.hadoop.security.UserGroupInformation            
>    - Login successful for user [email protected] using keytab file 
> /home/nbasjes/.krb/nbasjes.keytab
> 11/03/2015 17:13:25   Job execution switched to status RUNNING.
> 11/03/2015 17:13:25   Custom Source -> Stream Sink(1/1) switched to SCHEDULED 
> 11/03/2015 17:13:25   Custom Source -> Stream Sink(1/1) switched to DEPLOYING 
> 11/03/2015 17:13:25   Custom Source -> Stream Sink(1/1) switched to RUNNING 
> {quote}
> Which looks good.
> However ... no data goes into HBase.
> After some digging I found this error in the task managers log:
> {quote}
> 17:13:42,677 WARN  org.apache.hadoop.hbase.ipc.RpcClient                      
>    - Exception encountered while connecting to the server : 
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]
> 17:13:42,677 FATAL org.apache.hadoop.hbase.ipc.RpcClient                      
>    - SASL authentication failed. The most likely cause is missing or invalid 
> credentials. Consider 'kinit'.
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]
>       at 
> com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
>       at 
> org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:177)
>       at 
> org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupSaslConnection(RpcClient.java:815)
>       at 
> org.apache.hadoop.hbase.ipc.RpcClient$Connection.access$800(RpcClient.java:349)
> {quote}
> First starting a yarn-session and then loading my job gives the same error.
> My best guess at this point is that Flink needs the same fix as described 
> here:
> https://issues.apache.org/jira/browse/SPARK-6918   ( 
> https://github.com/apache/spark/pull/5586 )
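For reference, the Spark fix linked above obtains an HBase delegation token on the client before job submission and ships it with the job, so executors can authenticate without a local TGT. A hedged sketch of the same idea, done reflectively so that setups without HBase on the classpath are unaffected (the class `HBaseTokenSketch` is hypothetical; `TokenUtil.obtainToken(Configuration)` is the HBase API the Spark patch invokes):

```java
import java.lang.reflect.Method;

// Hypothetical sketch of the SPARK-6918-style fix: obtain an HBase delegation
// token via reflection before submitting to YARN. Illustrative only.
public class HBaseTokenSketch {

    /**
     * Tries to obtain an HBase delegation token for the given HBase
     * Configuration object. Returns the token, or null when HBase (or
     * Hadoop) is not on the classpath.
     */
    static Object obtainHBaseToken(Object hbaseConf) {
        try {
            // TokenUtil.obtainToken(Configuration) exists in the HBase
            // 0.98/1.x client; it is looked up and invoked reflectively here
            // so there is no compile-time HBase dependency.
            Class<?> tokenUtil =
                Class.forName("org.apache.hadoop.hbase.security.token.TokenUtil");
            Method obtain = tokenUtil.getMethod(
                "obtainToken",
                Class.forName("org.apache.hadoop.conf.Configuration"));
            return obtain.invoke(null, hbaseConf);
        } catch (ReflectiveOperationException e) {
            // HBase not on the classpath, or the call failed: nothing to do.
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println("Token: " + obtainHBaseToken(null));
    }
}
```

The obtained token would then have to be added to the job's Credentials before the containers are launched, which is the part the Spark pull request wires into the YARN client.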



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)