[ 
https://issues.apache.org/jira/browse/FLINK-2977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15003790#comment-15003790
 ] 

ASF GitHub Bot commented on FLINK-2977:
---------------------------------------

Github user nielsbasjes commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1342#discussion_r44765814
  
    --- Diff: flink-dist/src/main/flink-bin/bin/config.sh ---
    @@ -249,7 +249,15 @@ if [ -n "$HADOOP_HOME" ]; then
         fi
     fi
     
    
    -INTERNAL_HADOOP_CLASSPATHS="$HADOOP_CLASSPATH:$HADOOP_CONF_DIR:$YARN_CONF_DIR"
    +INTERNAL_HADOOP_CLASSPATHS="${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}"
    +
    +if [ -n "${HBASE_CONF_DIR}" ]; then
    +    # Setup the HBase classpath.
    +    INTERNAL_HADOOP_CLASSPATHS="${INTERNAL_HADOOP_CLASSPATHS}:`hbase classpath`"
    --- End diff ---
    
    1) You see an error during the startup of your Flink job.
    2) An empty value is inserted into the path, causing your home directory to be on the classpath.
    3) HBase support doesn't work.
    
    A quick test on the command line:
    ```
    $ INTERNAL_HADOOP_CLASSPATHS=foo:bar
    $ INTERNAL_HADOOP_CLASSPATHS="${INTERNAL_HADOOP_CLASSPATHS}:`hbasexx classpath`"
    bash: hbasexx: command not found...
    $ echo $INTERNAL_HADOOP_CLASSPATHS
    foo:bar:
    ```
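    The failure mode shown above can be avoided by only calling `hbase classpath` when the command actually exists on the PATH. A minimal sketch of such a guard (the helper name `append_cmd_classpath` is hypothetical and not part of the patch):

    ```shell
    #!/usr/bin/env bash
    # Hypothetical helper (a sketch, not the patch's actual code): append the
    # output of "<cmd> classpath" to a classpath string only when <cmd> exists.
    # If the command is missing, warn and return the string unchanged, so no
    # empty trailing ":" entry (i.e. the current directory) sneaks in.
    append_cmd_classpath() {
        local base="$1" cmd="$2"
        if command -v "$cmd" >/dev/null 2>&1; then
            printf '%s:%s\n' "$base" "$("$cmd" classpath)"
        else
            echo "WARN: '$cmd' command not found; skipping its classpath." >&2
            printf '%s\n' "$base"
        fi
    }

    # With a missing command the original value stays intact:
    append_cmd_classpath "foo:bar" "hbasexx"   # -> foo:bar (warning on stderr)
    ```

    `command -v` is the POSIX way to test for a command's existence without executing it, which is why it is used here instead of letting the backtick substitution fail silently.
    
    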


> Cannot access HBase in a Kerberos secured Yarn cluster
> ------------------------------------------------------
>
>                 Key: FLINK-2977
>                 URL: https://issues.apache.org/jira/browse/FLINK-2977
>             Project: Flink
>          Issue Type: Bug
>          Components: YARN Client
>            Reporter: Niels Basjes
>            Assignee: Niels Basjes
>         Attachments: FLINK-2977-20151005-untested.patch, FLINK-2977-20151009.patch
>
>
> I have created a very simple Flink topology consisting of a streaming Source (that outputs the timestamp a few times per second) and a Sink (that puts that timestamp into a single record in HBase).
> Running this on a non-secure Yarn cluster works fine.
> To run it on a secured Yarn cluster my main routine now looks like this:
> {code}
> public static void main(String[] args) throws Exception {
>     System.setProperty("java.security.krb5.conf", "/etc/krb5.conf");
>     UserGroupInformation.loginUserFromKeytab("[email protected]", "/home/nbasjes/.krb/nbasjes.keytab");
>     final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
>     env.setParallelism(1);
>     DataStream<String> stream = env.addSource(new TimerTicksSource());
>     stream.addSink(new SetHBaseRowSink());
>     env.execute("Long running Flink application");
> }
> {code}
> When I run this:
>      flink run -m yarn-cluster -yn 1 -yjm 1024 -ytm 4096 ./kerberos-1.0-SNAPSHOT.jar
> I see after the startup messages:
> {quote}
> 17:13:24,466 INFO  org.apache.hadoop.security.UserGroupInformation - Login successful for user [email protected] using keytab file /home/nbasjes/.krb/nbasjes.keytab
> 11/03/2015 17:13:25   Job execution switched to status RUNNING.
> 11/03/2015 17:13:25   Custom Source -> Stream Sink(1/1) switched to SCHEDULED 
> 11/03/2015 17:13:25   Custom Source -> Stream Sink(1/1) switched to DEPLOYING 
> 11/03/2015 17:13:25   Custom Source -> Stream Sink(1/1) switched to RUNNING 
> {quote}
> Which looks good.
> However ... no data goes into HBase.
> After some digging I found this error in the task managers log:
> {quote}
> 17:13:42,677 WARN  org.apache.hadoop.hbase.ipc.RpcClient - Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
> 17:13:42,677 FATAL org.apache.hadoop.hbase.ipc.RpcClient - SASL authentication failed. The most likely cause is missing or invalid credentials. Consider 'kinit'.
> javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
>       at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
>       at org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:177)
>       at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupSaslConnection(RpcClient.java:815)
>       at org.apache.hadoop.hbase.ipc.RpcClient$Connection.access$800(RpcClient.java:349)
> {quote}
> First starting a yarn-session and then loading my job gives the same error.
> My best guess at this point is that Flink needs the same fix as described here:
> https://issues.apache.org/jira/browse/SPARK-6918 ( https://github.com/apache/spark/pull/5586 )



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
