Re: RE: Can't access remote Hive table from spark
Hi,

My spark-env.sh has the following entry for the classpath:

export SPARK_CLASSPATH=$SPARK_CLASSPATH:/usr/lib/hive/lib/*:/etc/hive/conf/

-Skanda

On Sun, Feb 1, 2015 at 11:45 AM, guxiaobo1982 wrote:

> Hi Skanda,
>
> How do you set up your SPARK_CLASSPATH?
>
> I added the following line to my SPARK_HOME/conf/spark-env.sh and still
> got the same error:
>
> export SPARK_CLASSPATH=${SPARK_CLASSPATH}:/etc/hive/conf
>
> ------ Original ------
> From: "Skanda Prasad"
> Sent: Monday, Jan 26, 2015 7:41 AM
> To: "user@spark.apache.org"
> Subject: RE: Can't access remote Hive table from spark
>
> This happened to me as well: putting hive-site.xml inside conf doesn't
> seem to work. Instead I added /etc/hive/conf to SPARK_CLASSPATH and it
> worked. You can try this approach.
>
> -Skanda
>
> ------ Original ------
> From: guxiaobo1982
> Sent: 25-01-2015 13:50
> To: user@spark.apache.org
> Subject: Can't access remote Hive table from spark
>
> Hi,
>
> I built and started a single-node standalone Spark 1.2.0 cluster along
> with a single-node Hive 0.14.0 instance installed by Ambari 1.17.0. On
> the Spark and Hive node I can create and query tables inside Hive, and
> from remote machines I can submit the SparkPi example to the Spark
> master. But I failed to run the following example code:
>
> import java.util.List;
> import org.apache.spark.SparkConf;
> import org.apache.spark.api.java.JavaSparkContext;
> import org.apache.spark.sql.api.java.Row;
> import org.apache.spark.sql.hive.api.java.JavaHiveContext;
>
> public class SparkTest {
>     public static void main(String[] args) {
>         String appName = "This is a test application";
>         String master = "spark://lix1.bh.com:7077";
>         SparkConf conf = new SparkConf().setAppName(appName).setMaster(master);
>         JavaSparkContext sc = new JavaSparkContext(conf);
>         JavaHiveContext sqlCtx = new JavaHiveContext(sc);
>         // sqlCtx.sql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING)");
>         // sqlCtx.sql("LOAD DATA LOCAL INPATH '/opt/spark/examples/src/main/resources/kv1.txt' INTO TABLE src");
>         // Queries are expressed in HiveQL.
>         List<Row> rows = sqlCtx.sql("FROM src SELECT key, value").collect();
>         System.out.print("I got " + rows.size() + " rows\r\n");
>         sc.close();
>     }
> }
>
> Exception in thread "main" org.apache.hadoop.hive.ql.metadata.InvalidTableException: Table not found src
>   at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:980)
>   at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:950)
>   at org.apache.spark.sql.hive.HiveMetastoreCatalog.lookupRelation(HiveMetastoreCatalog.scala:70)
>   at org.apache.spark.sql.hive.HiveContext$$anon$2.org$apache$spark$sql$catalyst$analysis$OverrideCatalog$$super$lookupRelation(HiveContext.scala:253)
>   at org.apache.spark.sql.catalyst.analysis.OverrideCatalog$$anonfun$lookupRelation$3.apply(Catalog.scala:141)
>   at org.apache.spark.sql.catalyst.analysis.OverrideCatalog$$anonfun$lookupRelation$3.apply(Catalog.scala:141)
>   at scala.Option.getOrElse(Option.scala:120)
>   at org.apache.spark.sql.catalyst.analysis.OverrideCatalog$class.lookupRelation(Catalog.scala:141)
>   at org.apache.spark.sql.hive.HiveContext$$anon$2.lookupRelation(HiveContext.scala:253)
>   at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$5.applyOrElse(Analyzer.scala:143)
>   at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$5.applyOrElse(Analyzer.scala:138)
>   at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:144)
>   at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:162)
>   at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
>   at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
>   at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
>   at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
>   at scala.collection.AbstractIterator.to(Iterator.scala:1157)
>   at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
>   at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
>   at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
>   at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
>   at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:191)
>   at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:147)
>   at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:135)
>   at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:138)
>   at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:137)
>   at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply
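For anyone comparing the settings in this thread: the working entry above puts both the Hive client jars and the Hive conf directory (which holds hive-site.xml) on the classpath, while the failing attempts in the messages below add only the conf directory. A minimal spark-env.sh fragment along those lines; the paths are the ones reported in this thread (an Ambari-managed install) and will differ on other setups:

    # spark-env.sh -- sketch based on the working setup reported above.
    # /usr/lib/hive/lib/* supplies the Hive client jars; /etc/hive/conf/
    # supplies hive-site.xml so the HiveContext connects to the remote
    # metastore instead of creating an empty local Derby one.
    export SPARK_CLASSPATH=$SPARK_CLASSPATH:/usr/lib/hive/lib/*:/etc/hive/conf/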
Re: RE: Can't access remote Hive table from spark
The following line does not work either:

export SPARK_CLASSPATH=/etc/hive/conf

------ Original ------
From: "guxiaobo1982"
Sent: Sunday, Feb 1, 2015 2:15 PM
To: "Skanda Prasad"; "user@spark.apache.org"
Cc: "徐涛" <77044...@qq.com>
Subject: Re: RE: Can't access remote Hive table from spark
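One possible reason the plain export above fails: spark-env.sh is only sourced by processes launched through Spark's own scripts, so a driver started any other way (for example directly from an IDE on a remote machine) never sees SPARK_CLASSPATH. A hedged alternative is to put the Hive conf (and jars) on the driver classpath explicitly at submit time; the --driver-class-path flag exists in Spark 1.2, the paths are the ones from this thread, and the jar name is made up for illustration:

    # Submit with the Hive configuration on the driver's classpath rather
    # than relying on spark-env.sh being sourced by the launching process.
    spark-submit \
      --master spark://lix1.bh.com:7077 \
      --driver-class-path "/etc/hive/conf:/usr/lib/hive/lib/*" \
      --class SparkTest \
      spark-test.jar   # hypothetical jar containing the SparkTest class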
Re: RE: Can't access remote Hive table from spark
Hi Skanda,

How do you set up your SPARK_CLASSPATH?

I added the following line to my SPARK_HOME/conf/spark-env.sh and still got the same error:

export SPARK_CLASSPATH=${SPARK_CLASSPATH}:/etc/hive/conf

------ Original ------
From: "Skanda Prasad"
Sent: Monday, Jan 26, 2015 7:41 AM
To: "user@spark.apache.org"
Subject: RE: Can't access remote Hive table from spark

This happened to me as well: putting hive-site.xml inside conf doesn't seem to work. Instead I added /etc/hive/conf to SPARK_CLASSPATH and it worked. You can try this approach.

-Skanda
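A quick way to check whether hive-site.xml is actually being picked up is to ask the HiveContext which tables it can see: if SHOW TABLES returns nothing while the Hive CLI on the server lists src, the driver is talking to a freshly created local metastore rather than the remote one. A minimal sketch against the same Spark 1.2 Java API used in this thread (the class name is hypothetical; the master URL is the one from the original post):

    import java.util.List;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.sql.api.java.Row;
    import org.apache.spark.sql.hive.api.java.JavaHiveContext;

    public class MetastoreCheck {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf()
                    .setAppName("Metastore check")
                    .setMaster("spark://lix1.bh.com:7077"); // master URL from the thread
            JavaSparkContext sc = new JavaSparkContext(conf);
            JavaHiveContext sqlCtx = new JavaHiveContext(sc);

            // SHOW TABLES is resolved against the metastore, so an empty
            // result suggests hive-site.xml was not on the driver classpath.
            List<Row> tables = sqlCtx.sql("SHOW TABLES").collect();
            System.out.println("Found " + tables.size() + " tables");
            for (Row t : tables) {
                System.out.println(t.getString(0));
            }
            sc.close();
        }
    }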