Hi Aneela,

My (little to no) understanding of how to make it work is to set the
hbase.security.authentication property to kerberos (see [1]).
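
On the client side that's just an hbase-site.xml entry along these
lines (a minimal sketch, assuming the cluster itself is already
kerberized per [1]):

    <!-- hbase-site.xml (client side): tell the HBase client to
         authenticate over Kerberos -->
    <property>
      <name>hbase.security.authentication</name>
      <value>kerberos</value>
    </property>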

Spark on YARN uses it to obtain the security tokens for Hive, HBase et
al. (see [2]). That happens when the Client starts its conversation
with the YARN ResourceManager (see [3]).
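
Under the covers it boils down to something like the following (a
simplified Scala sketch of what the HBaseCredentialProvider in [2]
does -- the real code loads HBase's TokenUtil reflectively; you
shouldn't need any of this in your own application):

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.hbase.HBaseConfiguration
    import org.apache.hadoop.hbase.security.token.TokenUtil
    import org.apache.hadoop.security.Credentials

    // Sketch: Spark only asks HBase for a delegation token when the
    // client configuration says hbase.security.authentication=kerberos.
    def obtainHBaseToken(hadoopConf: Configuration, creds: Credentials): Unit = {
      val hbaseConf = HBaseConfiguration.create(hadoopConf)
      if (hbaseConf.get("hbase.security.authentication") == "kerberos") {
        // Needs the HBase client jars and hbase-site.xml on the classpath
        val token = TokenUtil.obtainToken(hbaseConf)
        creds.addToken(token.getService, token)
      }
    }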

You should not do that yourself (and BTW you've got a typo in the
spark.yarn.security.tokens.habse.enabled setting -- it should be
hbase, not habse). I think the entire code you pasted duplicates what
Spark already does itself before requesting resources from YARN.
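
Concretely, going by your own spark-submit invocation, I'd expect
something along these lines to be enough (a sketch; note that
spark.yarn.security.tokens.hbase.enabled should already default to
true, so you shouldn't even have to set it):

    /usr/local/spark-2/bin/spark-submit \
      --master yarn \
      --principal spark/hadoop-master@platalyticsrealm \
      --keytab /etc/hadoop/conf/spark.keytab \
      --driver-class-path /root/hbase-1.2.2/conf \
      --class com.platalytics.example.spark.App \
      /home/vm6/project-1-jar-with-dependencies.jar

plus the HBase client jars on the classpath (e.g. via --jars) so Spark
can find TokenUtil when it requests the token.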

Give it a shot and report back, since I've never worked with such a
configuration and would love to improve in this (security) area.
Thanks!

[1] http://www.cloudera.com/documentation/enterprise/5-5-x/topics/cdh_sg_hbase_authentication.html#concept_zyz_vg5_nt__section_s1l_nwv_ls
[2] https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/security/HBaseCredentialProvider.scala#L58
[3] https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala#L396

Pozdrawiam,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski


On Fri, Aug 12, 2016 at 11:30 PM, Aneela Saleem <ane...@platalytics.com> wrote:
> Thanks for your response Jacek!
>
> Here is the code showing how Spark accesses HBase:
> System.setProperty("java.security.krb5.conf", "/etc/krb5.conf")
> System.setProperty("java.security.auth.login.config", "/etc/hbase/conf/zk-jaas.conf")
> val hconf = HBaseConfiguration.create()
> val tableName = "emp"
> hconf.set("hbase.zookeeper.quorum", "hadoop-master")
> hconf.set(TableInputFormat.INPUT_TABLE, tableName)
> hconf.set("hbase.zookeeper.property.clientPort", "2181")
> hconf.set("hbase.master", "hadoop-master:60000")
> hconf.set("hadoop.security.authentication", "kerberos")
> hconf.set("hbase.security.authentication", "kerberos")
> hconf.addResource(new Path("/etc/hbase/conf/core-site.xml"))
> hconf.addResource(new Path("/etc/hbase/conf/hbase-site.xml"))
> UserGroupInformation.setConfiguration(hconf)
> UserGroupInformation.loginUserFromKeytab("spark@platalyticsrealm", "/etc/hadoop/conf/sp.keytab")
> conf.set("spark.yarn.security.tokens.habse.enabled", "true")
> conf.set("hadoop.security.authentication", "true")
> conf.set("hbase.security.authentication", "true")
> conf.set("spark.authenticate", "true")
> conf.set("spark.authenticate.secret", "None")
> val sc = new SparkContext(conf)
> UserGroupInformation.setConfiguration(hconf)
> val keyTab = "/etc/hadoop/conf/sp.keytab"
> val ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI("spark/hadoop-master@platalyticsrealm", keyTab)
> UserGroupInformation.setLoginUser(ugi)
> HBaseAdmin.checkHBaseAvailable(hconf)
> ugi.doAs(new PrivilegedExceptionAction[Void]() {
>   override def run(): Void = {
>     val conf = new SparkConf().set("spark.shuffle.consolidateFiles", "true")
>     val sc = new SparkContext(conf)
>     val hbaseContext = new HBaseContext(sc, hconf)
>
>     val scan = new Scan()
>     scan.addColumn(columnName, "column1")
>     scan.setTimeRange(0L, 1416083300000L)
>     val rdd = hbaseContext.hbaseRDD("emp", scan)
>     println(rdd.count)
>     rdd.saveAsTextFile("hdfs://hadoop-master:8020/hbaseTemp/")
>     sc.stop()
>     return null
>   }
> })
> I have tried it with both Spark versions, 2.0 and 1.5.3, but the same
> exception was thrown.
>
> I floated this email on the HBase community as well; they recommended that
> I use Cloudera's SparkOnHBase library and asked me to try the above code,
> but nothing works. I'm stuck here.
>
>
> On Sat, Aug 13, 2016 at 7:07 AM, Jacek Laskowski <ja...@japila.pl> wrote:
>>
>> Hi,
>>
>> How do you access HBase? What's the version of Spark?
>>
>> (I don't see spark packages in the stack trace)
>>
>> Pozdrawiam,
>> Jacek Laskowski
>> ----
>> https://medium.com/@jaceklaskowski/
>> Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
>> Follow me at https://twitter.com/jaceklaskowski
>>
>>
>> On Sun, Aug 7, 2016 at 9:02 AM, Aneela Saleem <ane...@platalytics.com>
>> wrote:
>> > Hi all,
>> >
>> > I'm trying to run a Spark job that accesses HBase with security
>> > enabled. When I run the following command:
>> >
>> > /usr/local/spark-2/bin/spark-submit \
>> >   --keytab /etc/hadoop/conf/spark.keytab \
>> >   --principal spark/hadoop-master@platalyticsrealm \
>> >   --class com.platalytics.example.spark.App \
>> >   --master yarn \
>> >   --driver-class-path /root/hbase-1.2.2/conf \
>> >   /home/vm6/project-1-jar-with-dependencies.jar
>> >
>> >
>> > I get the following error:
>> >
>> >
>> > 2016-08-07 20:43:57,617 WARN [hconnection-0x24b5fa45-metaLookup-shared--pool2-t1] ipc.RpcClientImpl: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
>> > 2016-08-07 20:43:57,619 ERROR [hconnection-0x24b5fa45-metaLookup-shared--pool2-t1] ipc.RpcClientImpl: SASL authentication failed. The most likely cause is missing or invalid credentials. Consider 'kinit'.
>> > javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
>> >       at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
>> >       at org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:179)
>> >       at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupSaslConnection(RpcClientImpl.java:617)
>> >       at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.access$700(RpcClientImpl.java:162)
>> >       at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:743)
>> >       at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:740)
>> >       at java.security.AccessController.doPrivileged(Native Method)
>> >       at javax.security.auth.Subject.doAs(Subject.java:415)
>> >       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>> >       at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:740)
>> >       at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:906)
>> >       at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:873)
>> >       at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1241)
>> >       at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:227)
>> >       at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:336)
>> >       at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:34094)
>> >       at org.apache.hadoop.hbase.client.ClientSmallScanner$SmallScannerCallable.call(ClientSmallScanner.java:201)
>> >       at org.apache.hadoop.hbase.client.ClientSmallScanner$SmallScannerCallable.call(ClientSmallScanner.java:180)
>> >       at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:210)
>> >       at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:360)
>> >       at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:334)
>> >       at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:136)
>> >       at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:65)
>> >       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> >       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> >       at java.lang.Thread.run(Thread.java:745)
>> > Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
>> >       at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
>> >       at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:121)
>> >       at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
>> >       at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:223)
>> >       at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
>> >       at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
>> >       at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:193)
>> >       ... 25 more
>> >
>> >
>> > I have Spark running on YARN with security enabled. I have kinit'd from
>> > the console and provided the necessary principals and keytabs. Can you
>> > please help me find out the issue?
>> >
>> >
>> > Thanks
>
>

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscr...@spark.apache.org
