Re: Spark + HBase + Kerberos

2015-03-18 Thread Eric Walk
Hi Ted,

The Spark executors and HBase region servers/masters are all co-located. This is a 2-node test environment.

Best,
Eric

Eric Walk, Sr. Technical Consultant
p: 617.855.9255 | NASDAQ: PRFT | Perficient.com

From: Ted Yu 
Sent: Mar 18, 2015 2:46 PM
To: Eric Walk
Cc: user@spark.apache.org; Bill Busch
Subject: Re: Spark + HBase + Kerberos

Are hbase config / keytab files deployed on executor machines ?

Consider adding -Dsun.security.krb5.debug=true for debugging purposes.
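In Spark on YARN, a JVM flag like this has to reach the executor JVMs, not just the driver shell. A minimal sketch of how that might look for the job described below, passing the flag through `spark.driver.extraJavaOptions` and `spark.executor.extraJavaOptions` (the class and jar names are taken from the thread; everything else about the cluster is assumed):

```shell
# Hypothetical sketch: turn on JDK Kerberos tracing in both the driver
# and the executors. The debug output from the executors ends up in the
# YARN container logs, not in the driver console.
/usr/hdp/2.2.0.0-2041/spark/bin/spark-submit \
  --class HBaseTest \
  --master yarn-client \
  --conf "spark.driver.extraJavaOptions=-Dsun.security.krb5.debug=true" \
  --conf "spark.executor.extraJavaOptions=-Dsun.security.krb5.debug=true" \
  ~/spark-test_2.10-1.0.jar
```

After the run, `yarn logs -applicationId <appId>` should show the extra GSSAPI/krb5 trace for each container.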

Cheers

On Wed, Mar 18, 2015 at 11:39 AM, Eric Walk <eric.w...@perficient.com> wrote:
Having an issue connecting to HBase from a Spark container in a Secure Cluster. 
Haven’t been able to get past this issue, any thoughts would be appreciated.

We’re able to perform some operations like “CreateTable” successfully in the driver thread. Read requests (always in the executor threads) consistently fail with:
No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]

Logs and Scala code are attached; the names of the innocent have been masked for their protection (in a consistent manner).

Executing the following spark job (using HDP 2.2, Spark 1.2.0, HBase 0.98.4, 
Kerberos on AD):
export 
SPARK_CLASSPATH=/usr/hdp/2.2.0.0-2041/hbase/lib/hbase-server.jar:/usr/hdp/2.2.0.0-2041/hbase/lib/hbase-protocol.jar:/usr/hdp/2.2.0.0-2041/hbase/lib/hbase-hadoop2-compat.jar:/usr/hdp/2.2.0.0-2041/hbase/lib/hbase-client.jar:/usr/hdp/2.2.0.0-2041/hbase/lib/hbase-common.jar:/usr/hdp/2.2.0.0-2041/hbase/lib/htrace-core-3.0.4.jar:/usr/hdp/2.2.0.0-2041/hbase/lib/guava-12.0.1.jar:/usr/hdp/2.2.0.0-2041/hbase/conf

/usr/hdp/2.2.0.0-2041/spark/bin/spark-submit --class HBaseTest --driver-memory 
2g --executor-memory 1g --executor-cores 1 --num-executors 1 --master 
yarn-client ~/spark-test_2.10-1.0.jar
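One thing worth noting for this setup: in Spark 1.2 on YARN, the executor containers do not, as far as I know, automatically receive HBase credentials (automatic HBase delegation-token fetching came in a later Spark release), so executor-side reads fail even when the driver can authenticate. A commonly suggested workaround is to ship the HBase client config and a keytab to the containers and log in from executor code. A hedged sketch, in which the keytab path, file names, and principal are placeholders, not values from this cluster:

```shell
# Hypothetical sketch: --files copies each listed file into every YARN
# container's working directory, so executor-side code can reach them
# by bare file name.
/usr/hdp/2.2.0.0-2041/spark/bin/spark-submit \
  --class HBaseTest \
  --master yarn-client \
  --files /etc/hbase/conf/hbase-site.xml,/etc/security/keytabs/app.keytab \
  --driver-memory 2g --executor-memory 1g \
  --executor-cores 1 --num-executors 1 \
  ~/spark-test_2.10-1.0.jar
```

The executor code would then call something like `UserGroupInformation.loginUserFromKeytab("appuser@REALM", "app.keytab")` before opening its HBase connection, so that each executor JVM has its own Kerberos login rather than relying on a TGT it never received.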

We see this error in the executor processes (attached as yarn log.txt):
2015-03-18 17:34:15,121 DEBUG [Executor task launch worker-0] 
security.HBaseSaslRpcClient: Creating SASL GSSAPI client. Server's Kerberos 
principal name is hbase/ldevawshdp0002..pvc@.PVC
2015-03-18 17:34:15,128 WARN  [Executor task launch worker-0] ipc.RpcClient: 
Exception encountered while connecting to the server : 
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: 
No valid credentials provided (Mechanism level: Failed to find any Kerberos 
tgt)]
2015-03-18 17:34:15,129 ERROR [Executor task launch worker-0] ipc.RpcClient: 
SASL authentication failed. The most likely cause is missing or invalid 
credentials. Consider 'kinit'.
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: 
No valid credentials provided (Mechanism level: Failed to find any Kerberos 
tgt)]
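This particular error means the executor JVM found no TGT and no usable keytab; it can appear even when the submitting user's shell has a perfectly good ticket, because the YARN containers are separate processes with no access to that credential cache. A quick way to sanity-check the keytab on an executor node (paths, principal, and realm below are placeholders, not the masked values from this cluster):

```shell
# klist -kt lists the principals/keys stored in a keytab; a manual kinit
# against it proves the keytab and the KDC (here, AD) agree on the keys.
klist -kt /etc/security/keytabs/app.keytab
kinit -kt /etc/security/keytabs/app.keytab appuser@EXAMPLE.PVC
klist   # should now show a valid krbtgt/EXAMPLE.PVC@EXAMPLE.PVC ticket
```

If the manual kinit fails here, the problem is the keytab or KDC configuration rather than anything in Spark or HBase.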

The HBase Master Logs show success:
2015-03-18 17:34:12,861 DEBUG [RpcServer.listener,port=6] ipc.RpcServer: 
RpcServer.listener,port=6: connection from 
10.4.0.6:46636; # active connections: 3
2015-03-18 17:34:12,872 DEBUG [RpcServer.reader=3,port=6] ipc.RpcServer: 
Kerberos principal name is hbase/ldevawshdp0001..pvc@.PVC
2015-03-18 17:34:12,875 DEBUG [RpcServer.reader=3,port=6] ipc.RpcServer: 
Created SASL server with mechanism = GSSAPI
2015-03-18 17:34:12,875 DEBUG [RpcServer.reader=3,port=6] ipc.RpcServer: 
Have read input token of size 1501 for processing by 
saslServer.evaluateResponse()
2015-03-18 17:34:12,876 DEBUG [RpcServer.reader=3,port=6] ipc.RpcServer: 
Will send token of size 108 from saslServer.
2015-03-18 17:34:12,877 DEBUG [RpcServer.reader=3,port=6] ipc.RpcServer: 
Have read input token of size 0 for processing by saslServer.evaluateResponse()
2015-03-18 17:34:12,878 DEBUG [RpcServer.reader=3,port=6] ipc.RpcServer: 
Will send token of size 32 from saslServer.
2015-03-18 17:34:12,878 DEBUG [RpcServer.reader=3,port=6] ipc.RpcServer: 
Have read input token of size 32 for processing by saslServer.evaluateResponse()
2015-03-18 17:34:12,879 DEBUG [RpcServer.reader=3,port=6] 
security.HBaseSaslRpcServer: SASL server GSSAPI callback: setting canonicalized 
client ID: @.PVC
2015-03-18 17:34:12,895 DEBUG [RpcServer.reader=3,port=6] ipc.RpcServer: 
SASL server context established. Authenticated client: @.PVC 
(auth:SIMPLE). Negotiated QoP is auth
2015-03-18 17:34:29,313 DEBUG [RpcServer.reader=3,port=6] ipc.RpcServer: 
RpcServer.listener,port=6: DISCONNECTING client 
10.4.0.6:46636 because read count=-1. Number of active 
connections: 3
2015-03-18 17:34:37,102 DEBUG [RpcServer.listener,port=6] ipc.RpcServer: 
RpcServer.listener,port=6: connection from 
10.4.0.6:46733; # active connections: 3
2015-03-18 17:34:37,102 DEBUG [RpcServer.reader=4,port=6] ipc.RpcServer: 
RpcServer.listener,port=6: DISCONNECTING client 
10.4.0.6:46733 because read count=-1. Number of active 
connections: 3

The Spark Driver Console Output hangs at this point:
2015-03-18 17:34:13,337 INFO  [main] spark.DefaultExecutionContext: Starting job: count at HBaseTest.scala:63
2015-03-18 17:34:13,349 INFO  [sparkDriver-akka.actor.default-dispatcher-4] scheduler.DAGScheduler: Got job 0 (count at HBaseTest.scala:63) with 1 output partitions (allowLocal=false)
2015-03-18 17:34:13,350 INFO  [sparkDri