Spark 1.6.0 HiveContext NPE

Shipper, Jay [USA] Wed, 03 Feb 2016 08:35:09 -0800

I’m upgrading an application from Spark 1.4.1 to Spark 1.6.0, and I’m getting a 
NullPointerException from HiveContext.  It happens while the application loads 
some tables via JDBC from an external database (not Hive), using 
context.read().jdbc():


—
java.lang.NullPointerException
at org.apache.spark.sql.hive.client.ClientWrapper.conf(ClientWrapper.scala:205)
at org.apache.spark.sql.hive.HiveContext.hiveconf$lzycompute(HiveContext.scala:552)
at org.apache.spark.sql.hive.HiveContext.hiveconf(HiveContext.scala:551)
at org.apache.spark.sql.hive.HiveContext$$anonfun$configure$1.apply(HiveContext.scala:538)
at org.apache.spark.sql.hive.HiveContext$$anonfun$configure$1.apply(HiveContext.scala:537)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.immutable.List.foreach(List.scala:318)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at org.apache.spark.sql.hive.HiveContext.configure(HiveContext.scala:537)
at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:250)
at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:237)
at org.apache.spark.sql.hive.HiveContext$$anon$2.<init>(HiveContext.scala:457)
at org.apache.spark.sql.hive.HiveContext.catalog$lzycompute(HiveContext.scala:457)
at org.apache.spark.sql.hive.HiveContext.catalog(HiveContext.scala:456)
at org.apache.spark.sql.hive.HiveContext$$anon$3.<init>(HiveContext.scala:473)
at org.apache.spark.sql.hive.HiveContext.analyzer$lzycompute(HiveContext.scala:473)
at org.apache.spark.sql.hive.HiveContext.analyzer(HiveContext.scala:472)
at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:34)
at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:133)
at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:52)
at org.apache.spark.sql.SQLContext.baseRelationToDataFrame(SQLContext.scala:442)
at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:223)
at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:146)
—

Even though the application is not using Hive, HiveContext is used instead of 
SQLContext, for the additional functionality it provides.  There’s no 
hive-site.xml for the application, but this did not cause an issue for Spark 
1.4.1.
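
For anyone who wants to confirm the "no hive-site.xml" part in their own 
setup, here is a minimal sketch of a classpath check.  It assumes the file 
would be picked up as a classpath resource; the class name HiveSiteCheck and 
its methods are hypothetical and are not part of Spark or Hive.

```java
import java.net.URL;

// Hypothetical helper, not part of Spark or Hive: reports whether a resource
// such as hive-site.xml is visible to the current class loader.
class HiveSiteCheck {

    // Returns the URL of the named resource, or null if it is not visible.
    static URL locate(String resourceName) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            // Some threads have no context class loader; fall back to the system one.
            loader = ClassLoader.getSystemClassLoader();
        }
        return loader.getResource(resourceName);
    }

    // Builds a one-line, human-readable summary of the lookup result.
    static String describe(String resourceName) {
        URL url = locate(resourceName);
        return url == null
                ? resourceName + ": not found on classpath"
                : resourceName + ": found at " + url;
    }

    public static void main(String[] args) {
        System.out.println(describe("hive-site.xml"));
    }
}
```

Run from inside the application (or with the same classpath), this only 
reports what the class loader can see; it does not say how HiveContext uses 
the file.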

Does anyone have an idea what changed between 1.4.1 and 1.6.0 that could 
explain this NPE?  The only obvious change I’ve noticed for HiveContext is that 
the default warehouse location is different (current directory in 1.4.1, 
/user/hive/warehouse in 1.6.0), but I verified that the NPE happens even when 
/user/hive/warehouse exists and is readable and writable by the application.  
On the application side, the only change made for Spark 1.6.0 that might be 
relevant is the upgrade of the Hadoop dependencies to match what Spark 1.6.0 
uses (2.6.0-cdh5.7.0-SNAPSHOT).

Thanks,
Jay
