Hi Abel,

Apologies, but I have been quite busy. The version I am using is here:
https://github.com/barkhorn/SparkOnHBase.

The TGT does look valid overall, so I will go back to my earlier thought that there is an issue with the classpath on the submit.

Thanks

On 22 November 2016 at 10:14, Abel Fernández <[email protected]> wrote:

> I think the tgt is not the problem, checking the logs I can see:
>
> 16/11/22 10:06:40 DEBUG [main] YarnSparkHadoopUtil: running as user: hbase
> 16/11/22 10:06:40 DEBUG [main] UserGroupInformation: hadoop login
> 16/11/22 10:06:40 DEBUG [main] UserGroupInformation: hadoop login commit
> 16/11/22 10:06:40 DEBUG [main] UserGroupInformation: using kerberos user:[email protected]
> 16/11/22 10:06:40 DEBUG [main] UserGroupInformation: Using user: "[email protected]" with name [email protected]
> 16/11/22 10:06:40 DEBUG [main] UserGroupInformation: User entry: "[email protected]"
> 16/11/22 10:06:40 DEBUG [main] UserGroupInformation: UGI loginUser:[email protected] (auth:KERBEROS)
> 16/11/22 10:06:40 DEBUG [main] UserGroupInformation: PrivilegedAction as:hbase (auth:SIMPLE) from:org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:68)
> 16/11/22 10:06:40 DEBUG [TGT Renewer for [email protected]] UserGroupInformation: Found tgt Ticket (hex) =
> 0000: 61 82 01 61 30 82 01 5D A0 03 02 01 05 A1 12 1B  a..a0..]........
> 0010: 10 53 41 4E 54 41 4E 44 45 52 55 4B 2E 43 4F 52  .COMPANY.COR
> 0020: 50 A2 25 30 23 A0 03 02 01 02 A1 1C 30 1A 1B 06  P.%0#.......0...
> 0030: 6B 72 62 74 67 74 1B 10 53 41 4E 54 41 4E 44 45
> ....
>
> Client Principal = [email protected]
> Server Principal = krbtgt/[email protected]
> Session Key = EncryptionKey: keyType=18 keyBytes (hex dump)=
> 0000: 2D 9D 67 F5 7C B4 15 17 AE DE BE A5 B9 2C 15 95  -.g..........,..
> 0010: E6 6B 1C 4A 02 A2 44 67 6D D2 16 36 4A DA 11 82  .k.J..Dgm..6J...
>
> Forwardable Ticket true
> Forwarded Ticket false
> Proxiable Ticket false
> Proxy Ticket false
> Postdated Ticket false
> Renewable Ticket true
> Initial Ticket true
> Auth Time = Tue Nov 22 03:39:05 CET 2016
> Start Time = Tue Nov 22 03:39:05 CET 2016
> End Time = Wed Nov 23 03:39:05 CET 2016
> Renew Till = Tue Nov 29 03:39:05 CET 2016
> Client Addresses Null
> 16/11/22 10:06:40 DEBUG [TGT Renewer for [email protected]] UserGroupInformation: Current time is 1479805600691
> 16/11/22 10:06:40 DEBUG [TGT Renewer for [email protected]] UserGroupInformation: Next refresh is 1479851465000
>
> Is the retrofit version you are using public? We are using CDH 5.5.4 but
> with a backported version of hbase on spark from the latest code released
> on github.
>
> On Mon, 21 Nov 2016 at 21:11 Nkechi Achara <[email protected]> wrote:
>
> > I am still convinced that it could be due to class path issues but I
> > might be missing something.
> >
> > Just to make sure.... Have you checked the use of the principal / keytab
> > on the driver only, so you can make sure the tgt is valid?
> >
> > I am using the same config but with CDH 5.5.2, but I am using a retrofit
> > of cloudera labs hbase on spark.
> >
> > Thanks
> >
> > On 21 Nov 2016 5:32 p.m., "Abel Fernández" <[email protected]> wrote:
> >
> > > I have included the krb5.conf and the jaas.conf in the spark-submit and
> > > on all nodemanagers and drivers, but I am still having the same problem.
> > >
> > > I think the problem is this piece of code: it is trying to execute a
> > > function on the executors and, for some reason, the executors cannot
> > > get valid credentials.
> > >
> > > /**
> > >  * A simple enrichment of the traditional Spark RDD foreachPartition.
> > >  * This function differs from the original in that it offers the
> > >  * developer access to a already connected Connection object
> > >  *
> > >  * Note: Do not close the Connection object.
> > >  * All Connection management is handled outside this method
> > >  *
> > >  * @param rdd  Original RDD with data to iterate over
> > >  * @param f    Function to be given a iterator to iterate through
> > >  *             the RDD values and a Connection object to interact
> > >  *             with HBase
> > >  */
> > > def foreachPartition[T](rdd: RDD[T],
> > >                         f: (Iterator[T], Connection) => Unit):Unit = {
> > >   rdd.foreachPartition(
> > >     it => hbaseForeachPartition(broadcastedConf, it, f))
> > > }
> > >
> > >
> > > The first thing hbaseForeachPartition tries to do is get the
> > > credentials, but I think this code is never executed:
> > >
> > > /**
> > >  * underlining wrapper all foreach functions in HBaseContext
> > >  */
> > > private def hbaseForeachPartition[T](configBroadcast:
> > >                                      Broadcast[SerializableWritable[Configuration]],
> > >                                      it: Iterator[T],
> > >                                      f: (Iterator[T], Connection) => Unit) = {
> > >
> > >   val config = getConf(configBroadcast)
> > >
> > >   applyCreds
> > >   // specify that this is a proxy user
> > >   val smartConn = HBaseConnectionCache.getConnection(config)
> > >   f(it, smartConn.connection)
> > >   smartConn.close()
> > > }
> > >
> > >
> > > This is the latest spark-submit I am using:
> > > #!/bin/bash
> > >
> > > SPARK_CONF_DIR=conf-hbase spark-submit --master yarn-cluster \
> > >   --executor-memory 6G \
> > >   --num-executors 10 \
> > >   --queue cards \
> > >   --executor-cores 4 \
> > >   --driver-java-options "-Dlog4j.configuration=file:log4j.properties" \
> > >   --driver-java-options "-Djava.security.krb5.conf=/etc/krb5.conf" \
> > >   --driver-java-options "-Djava.security.auth.login.config=/opt/company/conf/jaas.conf" \
> > >   --driver-class-path "$2" \
> > >   --jars file:/opt/company/lib/rocksdbjni-4.5.1.jar \
> > >   --conf "spark.driver.extraClassPath=/var/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.2.0-incubating.jar:/var/cloudera/parcels/CDH/jars/hbase-server-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/jars/hbase-common-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/lib/hbase/lib/hbase-client-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/lib/hbase/lib/hbase-protocol-1.0.0-cdh5.5.4.jar:/opt/orange/lib/rocksdbjni-4.5.1.jar:/var/cloudera/parcels/CLABS_PHOENIX-4.5.2-1.clabs_phoenix1.2.0.p0.774/lib/phoenix/lib/phoenix-core-1.2.0.jar:/var/cloudera/parcels/CDH/jars/hadoop-mapreduce-client-core-2.6.0-cdh5.5.4.jar" \
> > >   --conf "spark.executor.extraClassPath=/var/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.2.0-incubating.jar:/var/cloudera/parcels/CDH/jars/hbase-server-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/jars/hbase-common-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/lib/hbase/lib/hbase-client-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/lib/hbase/lib/hbase-protocol-1.0.0-cdh5.5.4.jar:/opt/orange/lib/rocksdbjni-4.5.1.jar:/var/cloudera/parcels/CLABS_PHOENIX-4.5.2-1.clabs_phoenix1.2.0.p0.774/lib/phoenix/lib/phoenix-core-1.2.0.jar:/var/cloudera/parcels/CDH/jars/hadoop-mapreduce-client-core-2.6.0-cdh5.5.4.jar" \
> > >   --principal [email protected] \
> > >   --keytab /opt/company/conf/hbase.keytab \
> > >   --files "owl.properties,conf-hbase/log4j.properties,conf-hbase/hbase-site.xml,conf-hbase/core-site.xml,$2" \
> > >   --class $1 \
> > >   cards-batch-$3-jar-with-dependencies.jar $2
> > >
> > >
> > > On Fri, 18 Nov 2016 at 16:37 Abel Fernández <[email protected]> wrote:
> > >
> > > > No worries.
> > > >
> > > > This is the spark version we are using: 1.5.0-cdh5.5.4
> > > >
> > > > I have to use HBase context; it is the first parameter for the method
> > > > I am using to generate the HFiles
> > > > (HBaseRDDFunctions.hbaseBulkLoadThinRows)
> > > >
> > > > On Fri, 18 Nov 2016 at 16:06 Nkechi Achara <[email protected]> wrote:
> > > >
> > > > Sorry, on my way to a flight.
> > > >
> > > > Read is required for a keytab to be permissioned properly. So that
> > > > looks fine in your case.
> > > >
> > > > I do not have my PC with me, but have you tried to use HBase without
> > > > using HBase context?
> > > >
> > > > Also which version of Spark are you using?
> > > >
> > > > On 18 Nov 2016 16:01, "Abel Fernández" <[email protected]> wrote:
> > > >
> > > > > Yep, the keytab is also on the driver in the same location.
> > > > >
> > > > > -rw-r--r-- 1 hbase root 370 Nov 16 17:13 hbase.keytab
> > > > >
> > > > > Do you know what permissions the keytab should have?
> > > > >
> > > > >
> > > > > On Fri, 18 Nov 2016 at 14:19 Nkechi Achara <[email protected]> wrote:
> > > > >
> > > > > > Sorry, just realised you had the submit command in the attached
> > > > > > docs.
> > > > > >
> > > > > > Can I ask if the keytab is also on the driver in the same location?
> > > > > >
> > > > > > The spark option normally requires the keytab to be on the driver
> > > > > > so it can pick it up and pass it to yarn etc to perform the
> > > > > > kerberos operations.
> > > > > >
> > > > > > On 18 Nov 2016 3:10 p.m., "Abel Fernández" <[email protected]> wrote:
> > > > > >
> > > > > > > Hi Nkechi,
> > > > > > >
> > > > > > > Thanks for your early response.
> > > > > > >
> > > > > > > I am currently specifying the principal and the keytab in the
> > > > > > > spark-submit; the keytab is in the same location in every node
> > > > > > > manager.
> > > > > > >
> > > > > > > SPARK_CONF_DIR=conf-hbase spark-submit --master yarn-cluster \
> > > > > > >   --executor-memory 6G \
> > > > > > >   --num-executors 10 \
> > > > > > >   --queue cards \
> > > > > > >   --executor-cores 4 \
> > > > > > >   --driver-java-options "-Dlog4j.configuration=file:log4j.properties" \
> > > > > > >   --driver-class-path "$2" \
> > > > > > >   --jars file:/opt/orange/lib/rocksdbjni-4.5.1.jar \
> > > > > > >   --conf "spark.driver.extraClassPath=/var/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.2.0-incubating.jar:/var/cloudera/parcels/CDH/jars/hbase-server-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/jars/hbase-common-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/lib/hbase/lib/hbase-client-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/lib/hbase/lib/hbase-protocol-1.0.0-cdh5.5.4.jar:/opt/orange/lib/rocksdbjni-4.5.1.jar:/var/cloudera/parcels/CLABS_PHOENIX-4.5.2-1.clabs_phoenix1.2.0.p0.774/lib/phoenix/lib/phoenix-core-1.2.0.jar:/var/cloudera/parcels/CDH/jars/hadoop-mapreduce-client-core-2.6.0-cdh5.5.4.jar" \
> > > > > > >   --conf "spark.executor.extraClassPath=/var/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.2.0-incubating.jar:/var/cloudera/parcels/CDH/jars/hbase-server-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/jars/hbase-common-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/lib/hbase/lib/hbase-client-1.0.0-cdh5.5.4.jar:/var/cloudera/parcels/CDH/lib/hbase/lib/hbase-protocol-1.0.0-cdh5.5.4.jar:/opt/orange/lib/rocksdbjni-4.5.1.jar:/var/cloudera/parcels/CLABS_PHOENIX-4.5.2-1.clabs_phoenix1.2.0.p0.774/lib/phoenix/lib/phoenix-core-1.2.0.jar:/var/cloudera/parcels/CDH/jars/hadoop-mapreduce-client-core-2.6.0-cdh5.5.4.jar" \
> > > > > > >   --principal [email protected] \
> > > > > > >   --keytab /opt/company/conf/hbase.keytab \
> > > > > > >   --files "owl.properties,conf-hbase/log4j.properties,conf-hbase/hbase-site.xml,conf-hbase/core-site.xml,$2" \
> > > > > > >   --class $1 \
> > > > > > >   cards-batch-$3-jar-with-dependencies.jar $2
> > > > > > >
> > > > > > > On Fri, 18 Nov 2016 at 14:01 Nkechi Achara <[email protected]> wrote:
> > > > > > >
> > > > > > > > Can you use the principal and keytab options in Spark submit?
> > > > > > > > These should circumvent this issue.
> > > > > > > >
> > > > > > > > On 18 Nov 2016 1:01 p.m., "Abel Fernández" <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > Hello,
> > > > > > > > >
> > > > > > > > > We are having problems with the delegation of the token in a
> > > > > > > > > secure cluster: "Delegation Token can be issued only with
> > > > > > > > > kerberos or web authentication"
> > > > > > > > >
> > > > > > > > > We have a spark process which is generating the HFiles to be
> > > > > > > > > loaded into HBase. To generate these HFiles we are using the
> > > > > > > > > method HBaseRDDFunctions.hbaseBulkLoadThinRows (we are using
> > > > > > > > > a back-ported version of the latest hbase/spark code).
> > > > > > > > >
> > > > > > > > > I think the problem is in the piece of code below. This
> > > > > > > > > function is executed in every partition of the RDD; when the
> > > > > > > > > executors try to execute the code, they do not have a valid
> > > > > > > > > Kerberos credential and cannot execute anything.
> > > > > > > > >
> > > > > > > > > private def hbaseForeachPartition[T](configBroadcast:
> > > > > > > > >                                      Broadcast[SerializableWritable[Configuration]],
> > > > > > > > >                                      it: Iterator[T],
> > > > > > > > >                                      f: (Iterator[T], Connection) => Unit) = {
> > > > > > > > >
> > > > > > > > >   val config = getConf(configBroadcast)
> > > > > > > > >
> > > > > > > > >   applyCreds
> > > > > > > > >   // specify that this is a proxy user
> > > > > > > > >   val smartConn = HBaseConnectionCache.getConnection(config)
> > > > > > > > >   f(it, smartConn.connection)
> > > > > > > > >   smartConn.close()
> > > > > > > > > }
> > > > > > > > >
> > > > > > > > > I have attached the spark-submit and the complete error log
> > > > > > > > > trace. Has anyone faced this problem before?
> > > > > > > > >
> > > > > > > > > Thanks in advance.
> > > > > > > > >
> > > > > > > > > Regards,
> > > > > > > > > Abel.
> > > > > > > > > --
> > > > > > > > > Un saludo - Best Regards.
> > > > > > > > > Abel
