Re: Spark-sql USING from Zeppelin?

Felix Cheung Fri, 11 Mar 2016 17:33:07 -0800

Not from the stack. I think the best way is to run
jps -v
You should see a SparkSubmit process if it is running the Spark from your
SPARK_HOME.

_____________________________
From: Adam Hull <[email protected]>
Sent: Friday, March 11, 2016 2:06 PM
Subject: Re: Spark-sql USING from Zeppelin?
To:  <[email protected]>


Thanks Felix! I've been struggling to wrap my head around which
SQLContext.scala file is being executed in this stack. I've set
SPARK_HOME="/usr/local/opt/apache-spark/libexec" in my zeppelin-env.sh file,
but I also see ./spark in my Zeppelin install directory. I'd imagine that if
Zeppelin is actually using the Spark libs in /usr/local/opt/apache-spark/libexec
(installed by Homebrew), then it should parse the same as
/usr/local/opt/apache-spark/libexec/bin/spark-sql.

Any ideas which Spark jar or installation is being executed in this
stack trace?

On Fri, Mar 11, 2016 at 1:55 PM, Felix Cheung <[email protected]> wrote:

As you can see in the stack below, it's just calling SQLContext.sql():

    at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:725)

It is possible this is caused by some issue with line parsing. I will try
to take a look.
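
To illustrate the line-parsing hypothesis: the notebook paragraph contains two statements separated by semicolons, while SQLContext.sql() parses one statement per call. The following is only a minimal Python sketch, assuming that is the cause (the thread does not confirm it). It splits the paragraph so each statement can be submitted on its own. The helper name split_statements is hypothetical, and the `sqlContext` variable mentioned in the comment is assumed to be provided by a %pyspark paragraph.

```python
def split_statements(paragraph):
    """Split a SQL paragraph on semicolons, dropping empty pieces.

    Naive on purpose: it does not handle semicolons that appear
    inside string literals or comments.
    """
    return [s.strip() for s in paragraph.split(";") if s.strip()]


# The paragraph from the original report, minus the %sql directive.
paragraph = """
CREATE TEMPORARY TABLE data
USING org.apache.spark.sql.json
OPTIONS (
  path "s3a://some-bucket/data.json.gz"
);
SELECT * FROM data;
"""

statements = split_statements(paragraph)

# Assumption: in a %pyspark paragraph, each statement could then be
# run separately, for example:
#
#     for stmt in statements:
#         result = sqlContext.sql(stmt)
```

If the hypothesis holds, placing the CREATE TEMPORARY TABLE statement and the SELECT in separate %sql paragraphs, without trailing semicolons, would be the equivalent manual workaround.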
              
_____________________________
From: Adam Hull <[email protected]>
Sent: Friday, March 11, 2016 1:47 PM
Subject: Spark-sql USING from Zeppelin?
To: <[email protected]>

Hi! This whole ecosystem is pretty new to me.

I'd like to pull JSON files from S3 via the spark-sql interpreter. I've got
code that's working when I run `spark-sql foo.sql` directly, but it fails
from a Zeppelin notebook. Here's the code:

```
%sql
CREATE TEMPORARY TABLE data
USING org.apache.spark.sql.json
OPTIONS (
  path "s3a://some-bucket/data.json.gz"
);
SELECT * FROM data;
```

And here's the Zeppelin error:

cannot recognize input near 'data' 'USING' 'org' in table name; line 2 pos 0
    at org.apache.spark.sql.hive.HiveQl$.createPlan(HiveQl.scala:297)
    at org.apache.spark.sql.hive.ExtendedHiveQlParser$$anonfun$hiveQl$1.apply(ExtendedHiveQlParser.scala:41)
    at org.apache.spark.sql.hive.ExtendedHiveQlParser$$anonfun$hiveQl$1.apply(ExtendedHiveQlParser.scala:40)
    at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:136)
    at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:135)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
    at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$Failure.append(Parsers.scala:202)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
    at scala.util.parsing.combinator.Parsers$$anon$2$$anonfun$apply$14.apply(Parsers.scala:891)
    at scala.util.parsing.combinator.Parsers$$anon$2$$anonfun$apply$14.apply(Parsers.scala:891)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
    at scala.util.parsing.combinator.Parsers$$anon$2.apply(Parsers.scala:890)
    at scala.util.parsing.combinator.PackratParsers$$anon$1.apply(PackratParsers.scala:110)
    at org.apache.spark.sql.catalyst.AbstractSparkSQLParser.parse(AbstractSparkSQLParser.scala:34)
    at org.apache.spark.sql.hive.HiveQl$.parseSql(HiveQl.scala:277)
    at org.apache.spark.sql.hive.HiveQLDialect.parse(HiveContext.scala:62)
    at org.apache.spark.sql.SQLContext$$anonfun$3.apply(SQLContext.scala:175)
    at org.apache.spark.sql.SQLContext$$anonfun$3.apply(SQLContext.scala:175)
    at org.apache.spark.sql.SparkSQLParser$$anonfun$org$apache$spark$sql$SparkSQLParser$$others$1.apply(SparkSQLParser.scala:115)
    at org.apache.spark.sql.SparkSQLParser$$anonfun$org$apache$spark$sql$SparkSQLParser$$others$1.apply(SparkSQLParser.scala:114)
    at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:136)
    at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:135)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
    at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$Failure.append(Parsers.scala:202)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
    at scala.util.parsing.combinator.Parsers$$anon$2$$anonfun$apply$14.apply(Parsers.scala:891)
    at scala.util.parsing.combinator.Parsers$$anon$2$$anonfun$apply$14.apply(Parsers.scala:891)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
    at scala.util.parsing.combinator.Parsers$$anon$2.apply(Parsers.scala:890)
    at scala.util.parsing.combinator.PackratParsers$$anon$1.apply(PackratParsers.scala:110)
    at org.apache.spark.sql.catalyst.AbstractSparkSQLParser.parse(AbstractSparkSQLParser.scala:34)
    at org.apache.spark.sql.SQLContext$$anonfun$2.apply(SQLContext.scala:172)
    at org.apache.spark.sql.SQLContext$$anonfun$2.apply(SQLContext.scala:172)
    at org.apache.spark.sql.execution.datasources.DDLParser.parse(DDLParser.scala:42)
    at org.apache.spark.sql.SQLContext.parseSql(SQLContext.scala:195)
    at org.apache.spark.sql.hive.HiveContext.parseSql(HiveContext.scala:279)
    at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:725)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.zeppelin.spark.SparkSqlInterpreter.interpret(SparkSqlInterpreter.java:137)
    at org.apache.zeppelin.interpreter.ClassloaderInterpreter.interpret(ClassloaderInterpreter.java:57)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:93)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:331)
    at org.apache.zeppelin.scheduler.Job.run(Job.java:171)
    at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
                            
Seems to me that Zeppelin is using a different Spark SQL parser. I've checked
via the Spark UI that both `spark-sql` and Zeppelin are using Spark 1.5.1 and
Hadoop 2.6.0. I'm using Zeppelin 0.6.

Any suggestions where to look next? I see hive in that stack trace...
