Hi Aneela,

My (little to no) understanding of how to make it work is to set the hbase.security.authentication property to kerberos (see [1]).
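For concreteness, a minimal sketch of the settings in play, collected in a plain Map so the exact spellings are easy to check (the spellings below are the corrected ones; note that valid values for the *.security.authentication properties are "simple" or "kerberos", not "true"):

```scala
// Sketch only: the Kerberos-related settings discussed in this thread.
// In a real job they would be applied to HBaseConfiguration / SparkConf.
object SecuritySettings {
  // Correct spelling -- the code pasted below uses "habse" instead of "hbase".
  val hbaseTokenKey = "spark.yarn.security.tokens.hbase.enabled"

  val settings: Map[String, String] = Map(
    "hbase.security.authentication"  -> "kerberos", // per [1]
    "hadoop.security.authentication" -> "kerberos", // not "true"
    hbaseTokenKey                    -> "true"      // the default anyway
  )
}
```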
Spark on YARN uses it to get the tokens for Hive, HBase et al (see [2]). It happens when the Client starts its conversation with the YARN RM (see [3]). You should not do that yourself (and BTW you've got a typo in the spark.yarn.security.tokens.habse.enabled setting: it should be hbase, not habse). I think that the entire code you pasted matches what Spark itself does before requesting resources from YARN.

Give it a shot and report back, since I've never worked in such a configuration and would love to improve in this (security) area. Thanks!

[1] http://www.cloudera.com/documentation/enterprise/5-5-x/topics/cdh_sg_hbase_authentication.html#concept_zyz_vg5_nt__section_s1l_nwv_ls
[2] https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/security/HBaseCredentialProvider.scala#L58
[3] https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala#L396

Pozdrawiam,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski


On Fri, Aug 12, 2016 at 11:30 PM, Aneela Saleem <ane...@platalytics.com> wrote:
> Thanks for your response Jacek!
>
> Here is the code showing how Spark accesses HBase:
>
> System.setProperty("java.security.krb5.conf", "/etc/krb5.conf");
> System.setProperty("java.security.auth.login.config", "/etc/hbase/conf/zk-jaas.conf");
> val hconf = HBaseConfiguration.create()
> val tableName = "emp"
> hconf.set("hbase.zookeeper.quorum", "hadoop-master")
> hconf.set(TableInputFormat.INPUT_TABLE, tableName)
> hconf.set("hbase.zookeeper.property.clientPort", "2181")
> hconf.set("hbase.master", "hadoop-master:60000")
> hconf.set("hadoop.security.authentication", "kerberos")
> hconf.set("hbase.security.authentication", "kerberos")
> hconf.addResource(new Path("/etc/hbase/conf/core-site.xml"))
> hconf.addResource(new Path("/etc/hbase/conf/hbase-site.xml"))
> UserGroupInformation.setConfiguration(hconf)
> UserGroupInformation.loginUserFromKeytab("spark@platalyticsrealm", "/etc/hadoop/conf/sp.keytab")
> conf.set("spark.yarn.security.tokens.habse.enabled", "true")
> conf.set("hadoop.security.authentication", "true")
> conf.set("hbase.security.authentication", "true")
> conf.set("spark.authenticate", "true")
> conf.set("spark.authenticate.secret","None")
> val sc = new SparkContext(conf)
> UserGroupInformation.setConfiguration(hconf)
> val keyTab = "/etc/hadoop/conf/sp.keytab"
> val ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI("spark/hadoop-master@platalyticsrealm", keyTab)
> UserGroupInformation.setLoginUser(ugi)
> HBaseAdmin.checkHBaseAvailable(hconf);
> ugi.doAs(new PrivilegedExceptionAction[Void]() {
>   override def run(): Void = {
>     val conf = new SparkConf().set("spark.shuffle.consolidateFiles", "true")
>     val sc = new SparkContext(conf)
>     val hbaseContext = new HBaseContext(sc, hconf)
>     val scan = new Scan()
>     scan.addColumn(columnName, "column1")
>     scan.setTimeRange(0L, 1416083300000L)
>     val rdd = hbaseContext.hbaseRDD("emp", scan)
>     println(rdd.count)
>     rdd.saveAsTextFile("hdfs://hadoop-master:8020/hbaseTemp/")
>     sc.stop()
>     return null
>   }
> })
>
> I have tried it with both Spark versions, 2.0 and 1.5.3, but the same exception was thrown.
>
> I floated this email on the HBase community as well; they recommended me to use the SparkOnHBase Cloudera library and asked me to try the above code, but nothing works. I'm stuck here.
>
>
> On Sat, Aug 13, 2016 at 7:07 AM, Jacek Laskowski <ja...@japila.pl> wrote:
>>
>> Hi,
>>
>> How do you access HBase? What's the version of Spark?
>>
>> (I don't see spark packages in the stack trace)
>>
>> Pozdrawiam,
>> Jacek Laskowski
>> ----
>> https://medium.com/@jaceklaskowski/
>> Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
>> Follow me at https://twitter.com/jaceklaskowski
>>
>>
>> On Sun, Aug 7, 2016 at 9:02 AM, Aneela Saleem <ane...@platalytics.com> wrote:
>> > Hi all,
>> >
>> > I'm trying to run a Spark job that accesses HBase with security enabled.
>> > When I run the following command:
>> >
>> > /usr/local/spark-2/bin/spark-submit --keytab /etc/hadoop/conf/spark.keytab
>> > --principal spark/hadoop-master@platalyticsrealm --class
>> > com.platalytics.example.spark.App --master yarn --driver-class-path
>> > /root/hbase-1.2.2/conf /home/vm6/project-1-jar-with-dependencies.jar
>> >
>> > I get the following error:
>> >
>> > 2016-08-07 20:43:57,617 WARN [hconnection-0x24b5fa45-metaLookup-shared--pool2-t1] ipc.RpcClientImpl:
>> > Exception encountered while connecting to the server :
>> > javax.security.sasl.SaslException: GSS initiate failed [Caused by
>> > GSSException: No valid credentials provided (Mechanism level: Failed to find
>> > any Kerberos tgt)]
>> > 2016-08-07 20:43:57,619 ERROR [hconnection-0x24b5fa45-metaLookup-shared--pool2-t1] ipc.RpcClientImpl: SASL
>> > authentication failed. The most likely cause is missing or invalid
>> > credentials. Consider 'kinit'.
>> > javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
>> > at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
>> > at org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:179)
>> > at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupSaslConnection(RpcClientImpl.java:617)
>> > at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.access$700(RpcClientImpl.java:162)
>> > at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:743)
>> > at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:740)
>> > at java.security.AccessController.doPrivileged(Native Method)
>> > at javax.security.auth.Subject.doAs(Subject.java:415)
>> > at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>> > at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:740)
>> > at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:906)
>> > at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:873)
>> > at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1241)
>> > at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:227)
>> > at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:336)
>> > at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:34094)
>> > at org.apache.hadoop.hbase.client.ClientSmallScanner$SmallScannerCallable.call(ClientSmallScanner.java:201)
>> > at org.apache.hadoop.hbase.client.ClientSmallScanner$SmallScannerCallable.call(ClientSmallScanner.java:180)
>> > at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:210)
>> > at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:360)
>> > at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:334)
>> > at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:136)
>> > at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:65)
>> > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> > at java.lang.Thread.run(Thread.java:745)
>> > Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
>> > at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
>> > at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:121)
>> > at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
>> > at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:223)
>> > at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
>> > at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
>> > at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:193)
>> > ... 25 more
>> >
>> > I have Spark running on YARN with security enabled. I have kinit'd from the
>> > console and have provided the necessary principals and keytabs. Can you please
>> > help me find out the issue?
>> >
>> >
>> > Thanks
>
---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscr...@spark.apache.org
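Putting Jacek's suggestion together: with --principal and --keytab passed to spark-submit on YARN and an hbase-site.xml (with hbase.security.authentication set to kerberos) on the driver and executor classpaths, the manual UserGroupInformation logins in the pasted code should be unnecessary, and the driver could shrink to roughly this sketch. This assumes Spark's YARN client obtains the HBase delegation token itself as in [2], uses the thread's own host, table and time-range values, and takes HBaseContext from the Cloudera SparkOnHBase library mentioned in the thread; it will not compile without Spark and the HBase client on the classpath.

```scala
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Scan
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.spark.{SparkConf, SparkContext}
// Assumed to come from Cloudera's SparkOnHBase library, as used in the thread.
import com.cloudera.spark.hbase.HBaseContext

object App {
  def main(args: Array[String]): Unit = {
    // No loginUserFromKeytab / doAs here: with
    //   spark-submit --master yarn --principal ... --keytab ...
    // Spark's YARN Client fetches the HBase token before asking the RM for
    // resources (spark.yarn.security.tokens.hbase.enabled defaults to true).
    val hconf = HBaseConfiguration.create() // picks up hbase-site.xml from the classpath
    hconf.set(TableInputFormat.INPUT_TABLE, "emp")

    val sc = new SparkContext(new SparkConf())
    val hbaseContext = new HBaseContext(sc, hconf)

    val scan = new Scan()
    scan.setTimeRange(0L, 1416083300000L)

    val rdd = hbaseContext.hbaseRDD("emp", scan)
    println(rdd.count)
    sc.stop()
  }
}
```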