Can you share the exception(s) you encountered? Thanks
> On May 22, 2015, at 12:33 AM, donhoff_h <165612...@qq.com> wrote:
>
> Hi,
>
> My modified code is listed below; I just added the SecurityUtil API. I don't know which property keys I should use, so I made up two property keys of my own to point to the keytab and principal.
>
> object TestHBaseRead2 {
>   def main(args: Array[String]) {
>     val conf = new SparkConf()
>     val sc = new SparkContext(conf)
>     val hbConf = HBaseConfiguration.create()
>     hbConf.set("dhao.keytab.file","//etc//spark//keytab//spark.user.keytab")
>     hbConf.set("dhao.user.principal","sp...@bgdt.dev.hrb")
>     SecurityUtil.login(hbConf,"dhao.keytab.file","dhao.user.principal")
>     val conn = ConnectionFactory.createConnection(hbConf)
>     val tbl = conn.getTable(TableName.valueOf("spark_t01"))
>     try {
>       val get = new Get(Bytes.toBytes("row01"))
>       val res = tbl.get(get)
>       println("result:"+res.toString)
>     }
>     finally {
>       tbl.close()
>       conn.close()
>     }
>
>     val rdd = sc.parallelize(Array(1,2,3,4,5,6,7,8,9,10))
>     val v = rdd.sum()
>     println("Value="+v)
>     sc.stop()
>   }
> }
>
> ------------------ Original Message ------------------
> From: "yuzhihong" <yuzhih...@gmail.com>
> Date: Friday, May 22, 2015, 3:25 PM
> To: "donhoff_h" <165612...@qq.com>
> Cc: "Bill Q" <bill.q....@gmail.com>; "user" <user@spark.apache.org>
> Subject: Re: Re: How to use spark to access HBase with Security enabled
>
> Can you post the modified code?
>
> Thanks
>
>> On May 21, 2015, at 11:11 PM, donhoff_h <165612...@qq.com> wrote:
>>
>> Hi,
>>
>> Thanks very much for the reply. I have tried "SecurityUtil". I can see from the log that this statement executed successfully, but I still cannot pass the authentication of HBase. With more experiments I found an interesting new scenario: if I run the program in yarn-client mode, the driver can pass the authentication but the executors cannot; if I run it in yarn-cluster mode, neither the driver nor the executors can pass the authentication.
>> Can anybody give me a clue from this info? Many thanks!
>>
>> ------------------ Original Message ------------------
>> From: "yuzhihong" <yuzhih...@gmail.com>
>> Date: Friday, May 22, 2015, 5:29 AM
>> To: "donhoff_h" <165612...@qq.com>
>> Cc: "Bill Q" <bill.q....@gmail.com>; "user" <user@spark.apache.org>
>> Subject: Re: How to use spark to access HBase with Security enabled
>>
>> Are the worker nodes colocated with HBase region servers?
>>
>> Were you running as the hbase superuser?
>>
>> You may need to log in, using code similar to the following:
>>
>>   if (isSecurityEnabled()) {
>>     SecurityUtil.login(conf, fileConfKey, principalConfKey, localhost);
>>   }
>>
>> SecurityUtil is a Hadoop class.
>>
>> Cheers
>>
>>> On Thu, May 21, 2015 at 1:58 AM, donhoff_h <165612...@qq.com> wrote:
>>> Hi,
>>>
>>> Many thanks for the help. My Spark version is 1.3.0 too and I run it on YARN. According to your advice I have changed the configuration. Now my program reads hbase-site.xml correctly, and it also authenticates with ZooKeeper successfully.
>>>
>>> But I have met a new problem: my program still cannot pass the authentication of HBase. Did you or anybody else ever meet this kind of situation? I used a keytab file to provide the principal. Since it passes the ZooKeeper authentication, I am sure the keytab file is OK, but it just cannot pass the HBase authentication. The exception is listed below; could you or anybody else help me? Still many, many thanks!
>>>
>>> ****************************Exception***************************
>>> 15/05/21 16:03:18 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=bgdt02.dev.hrb:2181,bgdt01.dev.hrb:2181,bgdt03.dev.hrb:2181 sessionTimeout=90000 watcher=hconnection-0x4e142a710x0, quorum=bgdt02.dev.hrb:2181,bgdt01.dev.hrb:2181,bgdt03.dev.hrb:2181, baseZNode=/hbase
>>> 15/05/21 16:03:18 INFO zookeeper.Login: successfully logged in.
>>> 15/05/21 16:03:18 INFO zookeeper.Login: TGT refresh thread started.
>>> 15/05/21 16:03:18 INFO client.ZooKeeperSaslClient: Client will use GSSAPI as SASL mechanism.
>>> 15/05/21 16:03:18 INFO zookeeper.ClientCnxn: Opening socket connection to server bgdt02.dev.hrb/130.1.9.98:2181. Will attempt to SASL-authenticate using Login Context section 'Client'
>>> 15/05/21 16:03:18 INFO zookeeper.ClientCnxn: Socket connection established to bgdt02.dev.hrb/130.1.9.98:2181, initiating session
>>> 15/05/21 16:03:18 INFO zookeeper.Login: TGT valid starting at: Thu May 21 16:03:18 CST 2015
>>> 15/05/21 16:03:18 INFO zookeeper.Login: TGT expires: Fri May 22 16:03:18 CST 2015
>>> 15/05/21 16:03:18 INFO zookeeper.Login: TGT refresh sleeping until: Fri May 22 11:43:32 CST 2015
>>> 15/05/21 16:03:18 INFO zookeeper.ClientCnxn: Session establishment complete on server bgdt02.dev.hrb/130.1.9.98:2181, sessionid = 0x24d46cb0ffd0020, negotiated timeout = 40000
>>> 15/05/21 16:03:18 WARN mapreduce.TableInputFormatBase: initializeTable called multiple times. Overwriting connection and table reference; TableInputFormatBase will not close these old references when done.
>>> 15/05/21 16:03:19 INFO util.RegionSizeCalculator: Calculating region sizes for table "ns_dev1:hd01".
>>> 15/05/21 16:03:19 WARN ipc.AbstractRpcClient: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
>>> 15/05/21 16:03:19 ERROR ipc.AbstractRpcClient: SASL authentication failed. The most likely cause is missing or invalid credentials. Consider 'kinit'.
>>> javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
>>>     at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
>>>     at org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:179)
>>>     at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupSaslConnection(RpcClientImpl.java:604)
>>>     at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.access$600(RpcClientImpl.java:153)
>>>     at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:730)
>>>     at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:727)
>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>     at javax.security.auth.Subject.doAs(Subject.java:415)
>>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>>>     at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:727)
>>>     at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:880)
>>>     at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:849)
>>>     at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1173)
>>>     at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:216)
>>>     at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:300)
>>>     at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:31751)
>>>     at org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:332)
>>>     at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:187)
>>>     at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:62)
>>>     at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:126)
>>>     at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:294)
>>>     at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:275)
>>>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>     at java.lang.Thread.run(Thread.java:745)
>>>
>>> ***********************I also list my code below in case someone can give me some advice from it*************************
>>> object TestHBaseRead {
>>>   def main(args: Array[String]) {
>>>     val conf = new SparkConf()
>>>     val sc = new SparkContext(conf)
>>>     val hbConf = HBaseConfiguration.create(sc.hadoopConfiguration)
>>>     val tbName = if(args.length==1) args(0) else "ns_dev1:hd01"
>>>     hbConf.set(TableInputFormat.INPUT_TABLE,tbName)
>>>     //I print the content of hbConf to check if it reads the correct hbase-site.xml
>>>     val it = hbConf.iterator()
>>>     while(it.hasNext) {
>>>       val e = it.next()
>>>       println("Key="+ e.getKey +" Value="+e.getValue)
>>>     }
>>>
>>>     val rdd = sc.newAPIHadoopRDD(hbConf,classOf[TableInputFormat],classOf[ImmutableBytesWritable],classOf[Result])
>>>     rdd.foreach(x=>{
>>>       val key = x._1.toString
>>>       val it = x._2.listCells().iterator()
>>>       while(it.hasNext) {
>>>         val c = it.next()
>>>         val family = Bytes.toString(CellUtil.cloneFamily(c))
>>>         val qualifier = Bytes.toString(CellUtil.cloneQualifier(c))
>>>         val value = Bytes.toString(CellUtil.cloneValue(c))
>>>         val tm = c.getTimestamp
>>>         println("Key="+key+" Family="+family+" Qualifier="+qualifier+" Value="+value+" TimeStamp="+tm)
>>>       }
>>>     })
>>>     sc.stop()
>>>   }
>>> }
>>> ***************************I used the following command to run my program**********************
>>> spark-submit --class dhao.test.read.singleTable.TestHBaseRead --master yarn-cluster --driver-java-options "-Djava.security.auth.login.config=/home/spark/spark-hbase.jaas -Djava.security.krb5.conf=/etc/krb5.conf" --conf spark.executor.extraJavaOptions="-Djava.security.auth.login.config=/home/spark/spark-hbase.jaas -Djava.security.krb5.conf=/etc/krb5.conf" /home/spark/myApps/TestHBase.jar
>>>
>>> ------------------ Original Message ------------------
>>> From: "Bill Q" <bill.q....@gmail.com>
>>> Date: Wednesday, May 20, 2015, 10:13 PM
>>> To: "donhoff_h" <165612...@qq.com>
>>> Cc: "yuzhihong" <yuzhih...@gmail.com>; "user" <user@spark.apache.org>
>>> Subject: Re: How to use spark to access HBase with Security enabled
>>>
>>> I have a similar problem: I can no longer pass the HBase configuration directory to Spark as an extra classpath entry using spark.executor.extraClassPath=MY_HBASE_CONF_DIR in Spark 1.3. We used to run this in 1.2 without any problem.
>>>
>>>> On Tuesday, May 19, 2015, donhoff_h <165612...@qq.com> wrote:
>>>>
>>>> Sorry, this reference does not help me. I have set up the configuration in hbase-site.xml, but it seems there are still some extra configurations to be set, or APIs to be called, to make my Spark program pass authentication with HBase.
>>>>
>>>> Does anybody know how to authenticate to a secured HBase in a Spark program which uses the API "newAPIHadoopRDD" to get information from HBase?
>>>>
>>>> Many thanks!
>>>> ------------------ Original Message ------------------
>>>> From: "yuzhihong" <yuzhih...@gmail.com>
>>>> Date: Tuesday, May 19, 2015, 9:54 PM
>>>> To: "donhoff_h" <165612...@qq.com>
>>>> Cc: "user" <user@spark.apache.org>
>>>> Subject: Re: How to use spark to access HBase with Security enabled
>>>>
>>>> Please take a look at:
>>>> http://hbase.apache.org/book.html#_client_side_configuration_for_secure_operation
>>>>
>>>> Cheers
>>>>
>>>>> On Tue, May 19, 2015 at 5:23 AM, donhoff_h <165612...@qq.com> wrote:
>>>>>
>>>>> The principal is sp...@bgdt.dev.hrb. It is the user I used to run my Spark programs. I am sure I ran the kinit command to make it take effect, and I also used the HBase shell to verify that this user has the right to scan and put the tables in HBase.
>>>>>
>>>>> Now I still have no idea how to solve this problem. Can anybody help me figure it out? Many thanks!
>>>>>
>>>>> ------------------ Original Message ------------------
>>>>> From: "yuzhihong" <yuzhih...@gmail.com>
>>>>> Date: Tuesday, May 19, 2015, 7:55 PM
>>>>> To: "donhoff_h" <165612...@qq.com>
>>>>> Cc: "user" <user@spark.apache.org>
>>>>> Subject: Re: How to use spark to access HBase with Security enabled
>>>>>
>>>>> Which user did you run your program as?
>>>>>
>>>>> Have you granted the proper permissions on the HBase side?
>>>>>
>>>>> You should also check the master log to see if there is some clue.
>>>>>
>>>>> Cheers
>>>>>
>>>>>> On May 19, 2015, at 2:41 AM, donhoff_h <165612...@qq.com> wrote:
>>>>>>
>>>>>> Hi, experts.
>>>>>>
>>>>>> I ran the "HBaseTest" program, which is an example from the Apache Spark source code, to learn how to use Spark to access HBase.
>>>>>> But I met the following exception:
>>>>>>
>>>>>> Exception in thread "main" org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=36, exceptions:
>>>>>> Tue May 19 16:59:11 CST 2015, null, java.net.SocketTimeoutException: callTimeout=60000, callDuration=68648: row 'spark_t01,,00000000000000' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=bgdt01.dev.hrb,16020,1431412877700, seqNum=0
>>>>>>
>>>>>> I also checked the RegionServer log on the host "bgdt01.dev.hrb" listed in the above exception. I found a few entries like the following one:
>>>>>>
>>>>>> 2015-05-19 16:59:11,143 DEBUG [RpcServer.reader=2,bindAddress=bgdt01.dev.hrb,port=16020] ipc.RpcServer: RpcServer.listener,port=16020: Caught exception while reading: Authentication is required
>>>>>>
>>>>>> The above entry does not point to my program clearly, but the time is very close. Since my HBase version is 1.0.0 and I have security enabled, I suspect the exception was caused by Kerberos authentication, but I am not sure.
>>>>>>
>>>>>> Does anybody know if my guess is right? And if I am right, could anybody tell me how to set up Kerberos authentication in a Spark program? I don't know how to do it. I already checked the API docs but did not find any useful API. Many thanks!
>>>>>>
>>>>>> By the way, my Spark version is 1.3.0.
>>>>>> I also paste the code of "HBaseTest" in the following:
>>>>>> ***************************Source Code******************************
>>>>>> object HBaseTest {
>>>>>>   def main(args: Array[String]) {
>>>>>>     val sparkConf = new SparkConf().setAppName("HBaseTest")
>>>>>>     val sc = new SparkContext(sparkConf)
>>>>>>     val conf = HBaseConfiguration.create()
>>>>>>     conf.set(TableInputFormat.INPUT_TABLE, args(0))
>>>>>>
>>>>>>     // Initialize hBase table if necessary
>>>>>>     val admin = new HBaseAdmin(conf)
>>>>>>     if (!admin.isTableAvailable(args(0))) {
>>>>>>       val tableDesc = new HTableDescriptor(args(0))
>>>>>>       admin.createTable(tableDesc)
>>>>>>     }
>>>>>>
>>>>>>     val hBaseRDD = sc.newAPIHadoopRDD(conf, classOf[TableInputFormat],
>>>>>>       classOf[org.apache.hadoop.hbase.io.ImmutableBytesWritable],
>>>>>>       classOf[org.apache.hadoop.hbase.client.Result])
>>>>>>
>>>>>>     hBaseRDD.count()
>>>>>>
>>>>>>     sc.stop()
>>>>>>   }
>>>>>> }
>>>
>>> --
>>> Many thanks.
>>>
>>> Bill
>>
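A note for readers hitting the same problem: the spark-submit command earlier in the thread references the JAAS file by absolute path, which only works if that file exists at that path on every node. A sketch (not from the thread; the class name, JAAS file, and jar path are the thread's own) of shipping the JAAS file and keytab into every YARN container with `--files`, so both JVMs can reference them by bare filename in the container working directory:

```shell
# Sketch only: --files localizes each listed file into the working directory
# of the driver (yarn-cluster mode) and every executor container.
# The keytab path referenced inside spark-hbase.jaas must be updated to the
# localized filename as well.
spark-submit \
  --class dhao.test.read.singleTable.TestHBaseRead \
  --master yarn-cluster \
  --files /home/spark/spark-hbase.jaas,/etc/spark/keytab/spark.user.keytab \
  --driver-java-options "-Djava.security.auth.login.config=spark-hbase.jaas -Djava.security.krb5.conf=/etc/krb5.conf" \
  --conf spark.executor.extraJavaOptions="-Djava.security.auth.login.config=spark-hbase.jaas -Djava.security.krb5.conf=/etc/krb5.conf" \
  /home/spark/myApps/TestHBase.jar
```

Later Spark releases (1.4+) added `--keytab`/`--principal` options for spark-submit on YARN, which handle keytab distribution and ticket renewal directly; on Spark 1.3, as used in this thread, manual distribution along these lines is the available route.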