Some more info; I'm still digging.
I'm just trying to do `spark.table("db.table").count` from a spark-shell.
"db.table" is just a Hive table.
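For completeness, the whole session is just this (a minimal repro sketch; it assumes db.table already exists in the Hive metastore and has rows):

  // Run inside bin/spark-shell; the shell already provides the `spark` SparkSession.
  // "db.table" stands in for any existing Hive table.
  val n = spark.table("db.table").count
  println(n)  // used to print the row count; now it throws the AnalysisException shown below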
At commit b67668b this worked just fine and it returned the number of rows in
db.table.
Starting at ca99171 "[SPARK-15073][SQL] Hide SparkSession constructor from the
public" it fails with:
org.apache.spark.sql.AnalysisException: Database 'db' does not exist;
at org.apache.spark.sql.catalyst.catalog.ExternalCatalog.requireDbExists(ExternalCatalog.scala:37)
at org.apache.spark.sql.catalyst.catalog.InMemoryCatalog.tableExists(InMemoryCatalog.scala:195)
at org.apache.spark.sql.catalyst.catalog.InMemoryCatalog.requireTableExists(InMemoryCatalog.scala:63)
at org.apache.spark.sql.catalyst.catalog.InMemoryCatalog.getTable(InMemoryCatalog.scala:186)
at org.apache.spark.sql.catalyst.catalog.SessionCatalog.lookupRelation(SessionCatalog.scala:337)
at org.apache.spark.sql.SparkSession.table(SparkSession.scala:524)
at org.apache.spark.sql.SparkSession.table(SparkSession.scala:520)
... 48 elided
If I run
`org.apache.spark.sql.SparkSession.builder.enableHiveSupport.getOrCreate.catalog.listDatabases.show(false)`
I get:
+------------------------------------------------------------------------------------------------+-----------+-----------+
|name                                                                                            |description|locationUri|
+------------------------------------------------------------------------------------------------+-----------+-----------+
|Database[name='default', description='default database', path='hdfs://ns/{CWD}/spark-warehouse']|           |           |
+------------------------------------------------------------------------------------------------+-----------+-----------+
Here {CWD} is the current working directory from which I started my spark-shell.
It looks like this commit causes spark.catalog to be the internal one instead
of the Hive one.
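One way to double-check which catalog the session wired in (a best-effort sketch; it assumes the spark.sql.catalogImplementation and spark.sql.warehouse.dir keys exist in this preview build):

  // "hive" means the Hive external catalog, "in-memory" means the internal one.
  println(spark.sparkContext.getConf.get("spark.sql.catalogImplementation", "<not set>"))
  // The warehouse location the session resolved; by default it ends up as ./spark-warehouse under the CWD.
  println(spark.conf.get("spark.sql.warehouse.dir", "<not set>"))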
Michael, I don't think this is related to the HDFS configurations; they are in
/etc/hadoop/conf on each of the nodes in the cluster.
Arun, I was referring to these docs:
http://home.apache.org/~pwendell/spark-releases/spark-2.0.0-preview-docs/sql-programming-guide.html
They need to be updated so that they no longer refer to HiveContext.
I don't think HiveContext should be marked as private[hive]; it should be
public.
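For comparison, this is what the SparkSession-based replacement for the old HiveContext usage looks like (a sketch based on the deprecation note Arun quoted below):

  import org.apache.spark.sql.SparkSession

  // Rough equivalent of `new org.apache.spark.sql.hive.HiveContext(sc)` in 2.0.
  val session = SparkSession.builder.enableHiveSupport.getOrCreate()
  session.sql("show databases").show()  // with Hive support wired in, this should list the metastore databases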
I’ll keep digging.
Doug
> On May 19, 2016, at 6:52 PM, Reynold Xin <[email protected]> wrote:
>
> The old one is deprecated but should still work though.
>
>
> On Thu, May 19, 2016 at 3:51 PM, Arun Allamsetty <[email protected]>
> wrote:
> Hi Doug,
>
> If you look at the API docs here:
> http://home.apache.org/~pwendell/spark-releases/spark-2.0.0-preview-docs/api/scala/index.html#org.apache.spark.sql.hive.HiveContext,
> you'll see
> Deprecated (Since version 2.0.0) Use SparkSession.builder.enableHiveSupport
> instead
> So you probably need to use that.
>
> Arun
>
> On Thu, May 19, 2016 at 3:44 PM, Michael Armbrust <[email protected]>
> wrote:
> 1. "val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)" doesn't
> work because "HiveContext not a member of org.apache.spark.sql.hive". I
> checked the documentation, and it looks like it should still work for
> spark-2.0.0-preview-bin-hadoop2.7.tgz
>
> HiveContext has been deprecated and moved to a 1.x compatibility package,
> which you'll need to include explicitly. Docs have not been updated yet.
>
> 2. I also tried the new spark session, `spark.table("db.table")`; it fails
> with an HDFS permission denied error: it can't write to "/user/hive/warehouse".
>
> Where are the HDFS configurations located? We might not be propagating them
> correctly any more.
>
>