> [S]ince Hive has a large number of dependencies, it is not included in
> the default Spark assembly. In order to use Hive you must first run
> 'SPARK_HIVE=true sbt/sbt assembly/assembly' (or use -Phive for maven).
> This command builds a new assembly jar that includes Hive. Note that this
> Hive assembly jar must also be present on all of the worker nodes, as
> they will need access to the Hive serialization and deserialization
> libraries (SerDes) in order to access data stored in Hive.
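For reference, the two build invocations from the passage above, run from
the Spark source root (the maven goals shown are a typical invocation and
an assumption on my part, not quoted from the docs):

    # Build an assembly jar that includes Hive (sbt):
    SPARK_HIVE=true sbt/sbt assembly/assembly

    # Or with maven, via the hive profile:
    mvn -Phive -DskipTests clean package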
On Fri, Jul 25, 2014 at 3:20 PM, Sameer Tilak <ssti...@live.com> wrote:

> Hi Jerry,
>
> I am having trouble with this. Maybe something is wrong with my import or
> version, etc.
>
> scala> import org.apache.spark.sql._;
> import org.apache.spark.sql._
>
> scala> val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
> <console>:24: error: object hive is not a member of package
> org.apache.spark.sql
>        val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
>                                                   ^
>
> Here is what I see for autocompletion:
>
> scala> org.apache.spark.sql.
> Row            SQLContext     SchemaRDD      SchemaRDDLike  api
> catalyst       columnar       execution      package        parquet
> test
>
> ------------------------------
> Date: Fri, 25 Jul 2014 17:48:27 -0400
> Subject: Re: Spark SQL and Hive tables
> From: chiling...@gmail.com
> To: user@spark.apache.org
>
> Hi Sameer,
>
> The blog post you referred to is about Spark SQL, but I don't think its
> intent is to guide you through reading data from Hive via Spark SQL, so
> don't worry too much about the blog post.
>
> The programming guide I referred to demonstrates how to read data from
> Hive using Spark SQL. It is a good starting point.
>
> Best Regards,
>
> Jerry
>
> On Fri, Jul 25, 2014 at 5:38 PM, Sameer Tilak <ssti...@live.com> wrote:
>
> Hi Michael,
>
> Thanks. I am not creating a HiveContext; I am creating a SQLContext. I am
> using CDH 5.1. Can you please let me know which conf/ directory you are
> talking about?
>
> ------------------------------
> From: mich...@databricks.com
> Date: Fri, 25 Jul 2014 14:34:53 -0700
> Subject: Re: Spark SQL and Hive tables
> To: user@spark.apache.org
>
> In particular, have you put your hive-site.xml in the conf/ directory?
> Also, are you creating a HiveContext instead of a SQLContext?
>
> On Fri, Jul 25, 2014 at 2:27 PM, Jerry Lam <chiling...@gmail.com> wrote:
>
> Hi Sameer,
>
> Maybe this page will help you:
> https://spark.apache.org/docs/latest/sql-programming-guide.html#hive-tables
>
> Best Regards,
>
> Jerry
>
> On Fri, Jul 25, 2014 at 5:25 PM, Sameer Tilak <ssti...@live.com> wrote:
>
> Hi All,
>
> I am trying to load data from Hive tables using Spark SQL. I am using
> spark-shell. Here is what I see:
>
> val trainingDataTable = sql("""SELECT prod.prod_num, demographics.gender,
> demographics.birth_year, demographics.income_group FROM prod p JOIN
> demographics d ON d.user_id = p.user_id""")
>
> 14/07/25 14:18:46 INFO Analyzer: Max iterations (2) reached for batch
> MultiInstanceRelations
> 14/07/25 14:18:46 INFO Analyzer: Max iterations (2) reached for batch
> CaseInsensitiveAttributeReferences
> java.lang.RuntimeException: Table Not Found: prod.
>
> I have these tables in Hive. I used the show tables command to confirm
> this. Can someone please let me know how I can make them accessible here?
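Putting the advice in this thread together: build Spark with Hive support,
put hive-site.xml in Spark's conf/ directory, and create a HiveContext
rather than a plain SQLContext. A minimal spark-shell session for Spark
1.0.x (the version CDH 5.1 ships) might look like the sketch below; it
assumes the prod and demographics tables exist in the Hive metastore, and
it also fixes the alias mismatch in the original query, which references
prod and demographics by full name after aliasing them as p and d.

    // Requires an assembly built with Hive support (-Phive / SPARK_HIVE=true)
    // and hive-site.xml in Spark's conf/ directory.
    import org.apache.spark.sql.hive.HiveContext

    // A HiveContext (not a plain SQLContext) reads the Hive metastore,
    // which is what makes Hive tables visible to Spark SQL.
    val hiveContext = new HiveContext(sc)

    // In Spark 1.0.x, HiveQL queries go through hql(); the aliases are
    // used consistently here, unlike in the original query.
    val trainingDataTable = hiveContext.hql("""
      SELECT p.prod_num, d.gender, d.birth_year, d.income_group
      FROM prod p JOIN demographics d ON d.user_id = p.user_id""")

    // trainingDataTable is a SchemaRDD; inspect a few rows.
    trainingDataTable.take(5).foreach(println)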