First you need to figure out where the table lives. Is it a table registered in Spark SQL code, or a Hive table? If it is a Hive table, check that hive-site.xml is on the classpath and that the metastore URI in hive-site.xml is configured correctly. Look at the interpreter log to see which metastore the interpreter is actually using.
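As a quick sanity check, a PySpark sketch along these lines (reusing the database and table names from your mail below) shows the difference: in Spark 1.6 a plain SQLContext only has an in-memory catalog, while HiveContext reads tables from the Hive metastore.

    from pyspark.sql import SQLContext, HiveContext

    sqlCtx = SQLContext(sc)    # in-memory catalog: sees only tables registered
                               # in this session, never Hive metastore tables
    hiveCtx = HiveContext(sc)  # reads the Hive metastore configured in hive-site.xml

    print(sqlCtx.tableNames())               # 'spend_dim' will be missing here
    print(hiveCtx.tableNames('marketview'))  # should list 'spend_dim' if the
                                             # metastore URI is correct

If hiveCtx.tableNames('marketview') comes back empty, the interpreter is most likely talking to the wrong metastore (e.g. a local Derby one created when hive-site.xml is not on the classpath). The relevant entry in hive-site.xml looks roughly like this (the host below is a placeholder for your metastore):

    <property>
      <name>hive.metastore.uris</name>
      <value>thrift://your-metastore-host:9083</value>
    </property>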
Ruslan Dautkhanov <[email protected]> wrote on Fri, Nov 25, 2016 at 4:04 AM:
> Problem 1, with sqlContext)
> Spark 1.6
> CDH 5.8.3
> Zeppelin 0.6.2
>
> Running
>
> sqlCtx = SQLContext(sc)
> sqlCtx.sql('select * from marketview.spend_dim')
>
> shows the exception "Table not found".
> The same runs fine when using hiveContext.
> See the full stack trace in [1].
> The same stack trace is in the log file [2].
>
> I probably wouldn't send this message seeking your help,
> but using hiveContext brings its own problems.
> Any ideas why sqlContext would not see that table?
>
> Problem 2, with HiveContext)
> The other problem, with hiveContext, is brought up in another email
> chain. We're getting:
> You must *build Spark with Hive*. Export 'SPARK_HIVE=true'
> The weird part of this hiveContext problem is that it only happens
> the second time we try to run a paragraph (and on any consecutive runs).
> The first time Zeppelin starts, I can see the same paragraph run fine.
> Does Zeppelin somehow corrupt its internal state after the first run?
>
> We use Jupyter notebooks without these problems in the same environment.
> Might it be something in how Zeppelin was compiled?
>
> This is how Zeppelin was built:
> /opt/maven/maven-latest/bin/mvn clean package -DskipTests -Pspark-1.6
> -Ppyspark -Dhadoop.version=2.6.0-cdh5.8.3 -Phadoop-2.6 -Pyarn -Pvendor-repo
> -Pscala-2.10 -e
>
> Any help will be greatly appreciated.
> You see, I'm sending this message on Thanksgiving, so it's an important
> problem :-)
> Happy Thanksgiving, everyone! (if you celebrate it)
>
>
> [1]
>
> Traceback (most recent call last):
>   File "/tmp/zeppelin_pyspark-8000586427786928449.py", line 267, in <module>
>     raise Exception(traceback.format_exc())
> Exception: Traceback (most recent call last):
>   File "/tmp/zeppelin_pyspark-8000586427786928449.py", line 265, in <module>
>     exec(code)
>   File "<stdin>", line 2, in <module>
>   File "/opt/cloudera/parcels/CDH/lib/spark/python/pyspark/sql/context.py", line 580, in sql
>     return DataFrame(self._ssql_ctx.sql(sqlQuery), self)
>   File "/opt/cloudera/parcels/CDH/lib/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 813, in __call__
>     answer, self.gateway_client, self.target_id, self.name)
>   File "/opt/cloudera/parcels/CDH/lib/spark/python/pyspark/sql/utils.py", line 51, in deco
>     raise AnalysisException(s.split(': ', 1)[1], stackTrace)
> AnalysisException: u'Table not found: `marketview`.`spend_dim`;'
>
>
> [2]
>
> ERROR [2016-11-24 00:18:34,579] ({pool-2-thread-5} SparkSqlInterpreter.java[interpret]:120) - Invocation target exception
> java.lang.reflect.InvocationTargetException
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.zeppelin.spark.SparkSqlInterpreter.interpret(SparkSqlInterpreter.java:115)
>         at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:94)
>         at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:341)
>         at org.apache.zeppelin.scheduler.Job.run(Job.java:176)
>         at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.spark.sql.AnalysisException: Table not found: `marketview`.`mv_update_2016q1`;
>         at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
>         at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:54)
>         at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:50)
>         at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:121)
>         at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:120)
>         at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:120)
>         at scala.collection.immutable.List.foreach(List.scala:318)
>         at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:120)
>         at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.checkAnalysis(CheckAnalysis.scala:50)
>         at org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:44)
>         at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:34)
>         at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:133)
>         at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:52)
>         at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:817)
>         ... 16 more
> INFO [2016-11-24 00:18:34,581] ({pool-2-thread-5} SchedulerFactory.java[jobFinished]:137) - Job remoteInterpretJob_1479971914506 finished by scheduler org.apache.zeppelin.spark.SparkInterpreter866606804
>
>
> Thank you,
> Ruslan
