> On Oct 21, 2015, at 8:45 PM, Chester Chen <ches...@alpinenow.com> wrote:
>
> Doug,
>    thanks for responding.
>
>> I think Spark just needs to be compiled against 1.2.1
>
> Can you elaborate on this, or point to the specific command you are
> referring to?
>
> In our build.scala, I was including the following:
>
>    "org.spark-project.hive" % "hive-exec" % "1.2.1.spark" intransitive()
>
> I am not sure how the Spark compilation is directly related to this;
> please explain.

I was referring to this comment,
https://issues.apache.org/jira/browse/SPARK-6906?focusedCommentId=14712336&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14712336
and the updated documentation,
http://spark.apache.org/docs/latest/sql-programming-guide.html#interacting-with-different-versions-of-hive-metastore

Perhaps I misunderstood your question and why you are trying to compile
against a different version of Hive.
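For what it's worth, that documentation section comes down to two settings,
spark.sql.hive.metastore.version and spark.sql.hive.metastore.jars. A minimal
sketch (the version and classpath below are illustrative placeholders, not
values from this thread):

    // Sketch only: point Spark SQL at a specific Hive metastore version,
    // per "Interacting with Different Versions of Hive Metastore".
    val conf = new org.apache.spark.SparkConf()
      .set("spark.sql.hive.metastore.version", "0.13.1")       // placeholder version
      .set("spark.sql.hive.metastore.jars", "/opt/hive/lib/*") // placeholder classpath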
> When we submit the Spark job, we call the Spark Yarn Client.scala directly
> (not using spark-submit). The client side does not depend on the
> spark-assembly jar (which is in the Hadoop cluster). The job submission
> actually failed on the client side.
>
> Currently we get around this by replacing Spark's hive-exec with the
> Apache hive-exec.

Why are you using the Spark Yarn Client.scala directly and not using the
SparkLauncher that was introduced in 1.4.0?
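For illustration, a minimal SparkLauncher-based submission might look like
the sketch below; the jar path and main class are placeholders, not values
from this thread:

    import org.apache.spark.launcher.SparkLauncher

    // Sketch only: submit to YARN through SparkLauncher instead of
    // calling org.apache.spark.deploy.yarn.Client directly.
    object SubmitSketch {
      def main(args: Array[String]): Unit = {
        val process = new SparkLauncher()
          .setAppResource("/path/to/my-app.jar") // placeholder application jar
          .setMainClass("com.example.MyJob")     // placeholder main class
          .setMaster("yarn-cluster")
          .launch()                              // returns a java.lang.Process
        process.waitFor()
      }
    }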
Doug

> On Wed, Oct 21, 2015 at 5:27 PM, Doug Balog <d...@balog.net> wrote:
> See comments below.
>
>> On Oct 21, 2015, at 5:33 PM, Chester Chen <ches...@alpinenow.com> wrote:
>>
>> All,
>>
>> just to see if this happens to others as well.
>>
>> This was tested against Spark 1.5.1 (branch 1.5, labeled 1.5.2-SNAPSHOT,
>> at commit 84f510c4fa06e43bd35e2dc8e1008d0590cbe266 from Tue Oct 6).
>>
>> Spark deployment mode: Spark-Cluster
>>
>> Notice that if we enable Kerberos mode, the Spark yarn client fails with
>> the following:
>>
>>   Could not initialize class org.apache.hadoop.hive.ql.metadata.Hive
>>   java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.hive.ql.metadata.Hive
>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>     at java.lang.reflect.Method.invoke(Method.java:606)
>>     at org.apache.spark.deploy.yarn.Client$.org$apache$spark$deploy$yarn$Client$$obtainTokenForHiveMetastore(Client.scala:1252)
>>     at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:271)
>>     at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:629)
>>     at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:119)
>>     at org.apache.spark.deploy.yarn.Client.run(Client.scala:907)
>>
>> Diving into the Yarn Client.scala code and testing against different
>> dependencies, I noticed the following: if Kerberos mode is enabled,
>> Client.obtainTokenForHiveMetastore() will try to use Scala reflection to
>> get Hive and HiveConf and the methods on those classes:
>>
>>   val hiveClass =
>>     mirror.classLoader.loadClass("org.apache.hadoop.hive.ql.metadata.Hive")
>>   val hive = hiveClass.getMethod("get").invoke(null)
>>
>>   val hiveConf = hiveClass.getMethod("getConf").invoke(hive)
>>   val hiveConfClass =
>>     mirror.classLoader.loadClass("org.apache.hadoop.hive.conf.HiveConf")
>>
>>   val hiveConfGet = (param: String) => Option(hiveConfClass
>>     .getMethod("get", classOf[java.lang.String])
>>     .invoke(hiveConf, param))
>>
>> If "org.spark-project.hive" % "hive-exec" % "1.2.1.spark" is used, then
>> you will get the above exception. But if we use
>> "org.apache.hive" % "hive-exec" % "0.13.1-cdh5.2.0",
>> the above method will not throw an exception.
>>
>> Here are some questions and comments:
>>
>> 0) Is this a bug?
>
> I'm not an expert on this, but I think this might not be a bug.
> The Hive integration was redone for 1.5.0, see
> https://issues.apache.org/jira/browse/SPARK-6906
> and I think Spark just needs to be compiled against 1.2.1.
>
>> 1) Why do the spark-hive hive-exec and the apache hive-exec behave
>> differently? I understand the spark-hive hive-exec has fewer
>> dependencies, but I would expect it to be functionally the same.
>
> I don't know.
>
>> 2) Where can I find the source code for the spark-hive hive-exec?
>
> I don't know.
>
>> 3) Regarding the method obtainTokenForHiveMetastore(), I would assume
>> that it would first check whether the hive-metastore URI is present
>> before trying to get the hive metastore tokens; it seems to invoke the
>> reflection regardless of whether the hive service is enabled in the
>> cluster or not.
>
> Checking to see if the hive-metastore URI is present before trying to get
> a delegation token would be an improvement.
> Also checking to see if we are running in cluster mode would be good, too.
> I will file a JIRA and make these improvements (a sketch of both
> improvements appears at the end of this thread).
>
>> 4) I noticed that obtainTokenForHBase() in the same class (Client.scala)
>> catches
>>
>>   case e: java.lang.NoClassDefFoundError => logDebug("HBase Class not found: " + e)
>>
>> and just ignores the exception (logs it at debug level), but
>> obtainTokenForHiveMetastore() does not catch NoClassDefFoundError;
>> I guess this is the problem.
>>
>>   private def obtainTokenForHiveMetastore(conf: Configuration, credentials: Credentials) {
>>     // rest of code
>>     } catch {
>>       case e: java.lang.NoSuchMethodException => { logInfo("Hive Method not found " + e); return }
>>       case e: java.lang.ClassNotFoundException => { logInfo("Hive Class not found " + e); return }
>>       case e: Exception => { logError("Unexpected Exception " + e)
>>         throw new RuntimeException("Unexpected exception", e)
>>       }
>>     }
>>   }
>
> I tested the code against different scenarios; it is possible that I missed
> the case where the class was not found.
> obtainTokenForHBase() was implemented after obtainTokenForHive().
>
> Cheers,
>
> Doug
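For illustration, here is a sketch of the improvements discussed in 3) and 4)
above. This is not the actual patch; the early-return check and the elided
token fetch are assumptions about how the fix could look:

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.security.Credentials

    // Sketch only, not the actual Spark change: skip the token fetch when no
    // remote metastore is configured, and treat an uninitializable Hive class
    // the way obtainTokenForHBase() treats a missing HBase class.
    def obtainTokenForHiveMetastoreSketch(conf: Configuration, credentials: Credentials): Unit = {
      try {
        val mirror = scala.reflect.runtime.currentMirror
        val hiveClass =
          mirror.classLoader.loadClass("org.apache.hadoop.hive.ql.metadata.Hive")
        val hive = hiveClass.getMethod("get").invoke(null)
        val hiveConf = hiveClass.getMethod("getConf").invoke(hive)
        val hiveConfClass =
          mirror.classLoader.loadClass("org.apache.hadoop.hive.conf.HiveConf")
        val hiveConfGet = (param: String) => Option(hiveConfClass
          .getMethod("get", classOf[java.lang.String])
          .invoke(hiveConf, param))

        // Improvement 3): no remote metastore URI means no delegation token
        // is needed, so return before doing any further Hive work.
        if (hiveConfGet("hive.metastore.uris").forall(_.toString.trim.isEmpty)) {
          return
        }

        // ... fetch the delegation token and add it to `credentials`,
        //     as in the existing Client.scala ...
      } catch {
        case e: java.lang.NoSuchMethodException => return
        case e: java.lang.ClassNotFoundException => return
        // Improvement 4): also catch NoClassDefFoundError, mirroring
        // obtainTokenForHBase(), so an uninitializable Hive class does not
        // abort job submission.
        case e: java.lang.NoClassDefFoundError => return
        case e: Exception =>
          throw new RuntimeException("Unexpected exception", e)
      }
    }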