Cool, I'll try that. One more hole in my understanding, if you've got the patience for it: if the SQLContext.sql(SQLContext.scala:725) error were thrown from the separate SparkSubmit process, would the JVM stitch together the stack traces across processes? The org.apache.zeppelin.spark.SparkSqlInterpreter.interpret(SparkSqlInterpreter.java:137) call deeper in the stack seems suspicious to me. Wouldn't that code only be loaded in the Zeppelin process?
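To make the question concrete, here's my mental model as a minimal Scala sketch. `TraceCapture` is a hypothetical helper of mine, not anything from Zeppelin or Spark; the point is that, as far as I know, a stack trace only ever describes frames inside the JVM that threw it, so for a trace to show up in another process, framework code has to render it to text and ship it over the wire itself:

```scala
import java.io.{PrintWriter, StringWriter}

// Hypothetical helper, not Zeppelin code: render a Throwable's stack
// trace to a String. The trace describes only frames in this JVM; to
// surface it in another process, a framework would have to send this
// text across the process boundary itself.
object TraceCapture {
  def render(t: Throwable): String = {
    val sw = new StringWriter()
    t.printStackTrace(new PrintWriter(sw, true))
    sw.toString // this string is what would cross the wire
  }

  def main(args: Array[String]): Unit = {
    try throw new RuntimeException("boom")
    catch { case e: Exception => print(render(e)) }
  }
}
```

If that model is right, then seeing the SparkSqlInterpreter frames and the SQLContext frames in one contiguous trace would suggest both classes are loaded into the same JVM, rather than the JVM stitching two processes' traces together. Am I off base?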
On Fri, Mar 11, 2016 at 5:32 PM, Felix Cheung <felixcheun...@hotmail.com> wrote:

> Not from the stack. I think the best way is to run
>
> jps -v
>
> You should see a SparkSubmit process if it is running the one from your
> spark home.
>
> _____________________________
> From: Adam Hull <a...@goodeggs.com>
> Sent: Friday, March 11, 2016 2:06 PM
> Subject: Re: Spark-sql USING from Zeppelin?
> To: <users@zeppelin.incubator.apache.org>
>
> Thanks Felix! I've been struggling to wrap my head around *which*
> SQLContext.scala file is being executed in this stack. I've set
> SPARK_HOME="/usr/local/opt/apache-spark/libexec" in my zeppelin-env.sh
> file, but I also see ./spark in my Zeppelin install directory. I'd imagine
> that if Zeppelin is actually using the Spark libs in
> /usr/local/opt/apache-spark/libexec (installed by Homebrew), then it
> should parse the same as /usr/local/opt/apache-spark/libexec/bin/spark-sql.
>
> Any ideas which Spark jar or installation is being executed in this stack
> trace?
>
> On Fri, Mar 11, 2016 at 1:55 PM, Felix Cheung <felixcheun...@hotmail.com>
> wrote:
>
>> As you can see in the stack below, it's just calling SQLContext.sql():
>>
>> org.apache.spark.sql.SQLContext.sql(SQLContext.scala:725)
>>
>> It is possible this is caused by some issue with line parsing. I will try
>> to take a look.
>>
>> _____________________________
>> From: Adam Hull <a...@goodeggs.com>
>> Sent: Friday, March 11, 2016 1:47 PM
>> Subject: Spark-sql USING from Zeppelin?
>> To: <users@zeppelin.incubator.apache.org>
>>
>> Hi! This whole ecosystem is pretty new to me.
>>
>> I'd like to pull JSON files from S3 via the spark-sql interpreter. I've
>> got code that works when I run `spark-sql foo.sql` directly, but it
>> fails from a Zeppelin notebook.
>> Here's the code:
>>
>> ```
>> %sql
>>
>> CREATE TEMPORARY TABLE data
>> USING org.apache.spark.sql.json
>> OPTIONS (
>>   path "s3a://some-bucket/data.json.gz"
>> );
>>
>> SELECT * FROM data;
>> ```
>>
>> And here's the Zeppelin error:
>>
>> cannot recognize input near 'data' 'USING' 'org' in table name; line 2 pos 0
>>   at org.apache.spark.sql.hive.HiveQl$.createPlan(HiveQl.scala:297)
>>   at org.apache.spark.sql.hive.ExtendedHiveQlParser$$anonfun$hiveQl$1.apply(ExtendedHiveQlParser.scala:41)
>>   at org.apache.spark.sql.hive.ExtendedHiveQlParser$$anonfun$hiveQl$1.apply(ExtendedHiveQlParser.scala:40)
>>   at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:136)
>>   at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:135)
>>   at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
>>   at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
>>   at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
>>   at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:254)
>>   at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:254)
>>   at scala.util.parsing.combinator.Parsers$Failure.append(Parsers.scala:202)
>>   at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
>>   at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
>>   at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
>>   at scala.util.parsing.combinator.Parsers$$anon$2$$anonfun$apply$14.apply(Parsers.scala:891)
>>   at scala.util.parsing.combinator.Parsers$$anon$2$$anonfun$apply$14.apply(Parsers.scala:891)
>>   at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
>>   at scala.util.parsing.combinator.Parsers$$anon$2.apply(Parsers.scala:890)
>>   at scala.util.parsing.combinator.PackratParsers$$anon$1.apply(PackratParsers.scala:110)
>>   at org.apache.spark.sql.catalyst.AbstractSparkSQLParser.parse(AbstractSparkSQLParser.scala:34)
>>   at org.apache.spark.sql.hive.HiveQl$.parseSql(HiveQl.scala:277)
>>   at org.apache.spark.sql.hive.HiveQLDialect.parse(HiveContext.scala:62)
>>   at org.apache.spark.sql.SQLContext$$anonfun$3.apply(SQLContext.scala:175)
>>   at org.apache.spark.sql.SQLContext$$anonfun$3.apply(SQLContext.scala:175)
>>   at org.apache.spark.sql.SparkSQLParser$$anonfun$org$apache$spark$sql$SparkSQLParser$$others$1.apply(SparkSQLParser.scala:115)
>>   at org.apache.spark.sql.SparkSQLParser$$anonfun$org$apache$spark$sql$SparkSQLParser$$others$1.apply(SparkSQLParser.scala:114)
>>   at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:136)
>>   at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:135)
>>   at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
>>   at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
>>   at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
>>   at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:254)
>>   at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:254)
>>   at scala.util.parsing.combinator.Parsers$Failure.append(Parsers.scala:202)
>>   at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
>>   at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
>>   at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
>>   at scala.util.parsing.combinator.Parsers$$anon$2$$anonfun$apply$14.apply(Parsers.scala:891)
>>   at scala.util.parsing.combinator.Parsers$$anon$2$$anonfun$apply$14.apply(Parsers.scala:891)
>>   at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
>>   at scala.util.parsing.combinator.Parsers$$anon$2.apply(Parsers.scala:890)
>>   at scala.util.parsing.combinator.PackratParsers$$anon$1.apply(PackratParsers.scala:110)
>>   at org.apache.spark.sql.catalyst.AbstractSparkSQLParser.parse(AbstractSparkSQLParser.scala:34)
>>   at org.apache.spark.sql.SQLContext$$anonfun$2.apply(SQLContext.scala:172)
>>   at org.apache.spark.sql.SQLContext$$anonfun$2.apply(SQLContext.scala:172)
>>   at org.apache.spark.sql.execution.datasources.DDLParser.parse(DDLParser.scala:42)
>>   at org.apache.spark.sql.SQLContext.parseSql(SQLContext.scala:195)
>>   at org.apache.spark.sql.hive.HiveContext.parseSql(HiveContext.scala:279)
>>   at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:725)
>>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>   at java.lang.reflect.Method.invoke(Method.java:497)
>>   at org.apache.zeppelin.spark.SparkSqlInterpreter.interpret(SparkSqlInterpreter.java:137)
>>   at org.apache.zeppelin.interpreter.ClassloaderInterpreter.interpret(ClassloaderInterpreter.java:57)
>>   at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:93)
>>   at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:331)
>>   at org.apache.zeppelin.scheduler.Job.run(Job.java:171)
>>   at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
>>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>>   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>   at java.lang.Thread.run(Thread.java:745)
>>
>> Seems to me that Zeppelin is using a different Spark SQL parser. I've
>> checked via the Spark UI that both `spark-sql` and Zeppelin are using
>> Spark 1.5.1 and Hadoop 2.6.0. I'm using Zeppelin 0.6.
>>
>> Any suggestions where to look next? I see Hive in that stack trace...
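P.S. In case a concrete fallback is useful while we chase down the parser question: my understanding (untested, so treat it as a sketch) is that the same temp table can be built from a %spark paragraph with the DataFrame reader, which avoids the SQL `USING` clause and the HiveQl/DDL parsing path in the trace above entirely. `sqlContext` is the SQLContext Zeppelin injects, and the path is the same placeholder as in my %sql paragraph:

```scala
// Possible workaround (a sketch, not verified): register the temp table
// via the DataFrame API so the CREATE ... USING statement never has to
// go through the SQL parser that is failing above.
val data = sqlContext.read
  .format("json")                          // same source as org.apache.spark.sql.json
  .load("s3a://some-bucket/data.json.gz")  // same placeholder path as above

// After this, `SELECT * FROM data` should work from a %sql paragraph.
data.registerTempTable("data")
```

If that fails too, it would point away from the parser and toward the S3/Hadoop configuration instead.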