Any updates on this? Or perhaps any tutorials that successfully integrate spark-csv into Zeppelin? If I can rule out the code as the problem, I can start looking into the install to see what's going wrong.
Thanks,
Ryan

On Tue, Sep 29, 2015 at 9:09 AM, Ryan <freelanceflashga...@gmail.com> wrote:

> Hi Alex,
>
> Thank you for getting back to me!
>
> The tutorial code was a bit confusing and made it seem like sqlContext was
> the proper variable to use:
>
> // Zeppelin creates and injects sc (SparkContext) and sqlContext
> (HiveContext or SqlContext)
>
> I tried as you mentioned, but am still getting similar errors. Here is the
> code I tried:
>
> %dep
> z.reset()
> z.load("com.databricks:spark-csv_2.10:1.2.0")
>
> %spark
> val crimeData = "hdfs://sandbox.hortonworks.com:8020/user/root/data/crime_incidents_2013_CSV.csv"
> sqlc.load("com.databricks.spark.csv", Map("path" -> crimeData, "header" -> "true")).registerTempTable("crimes")
>
> The %spark interpreter is bound in the settings. I clicked save again to
> make sure, then ran it again. I am getting this error:
>
> <console>:17: error: not found: value sqlc
> sqlc.load("com.databricks.spark.csv", Map("path" -> crimeData, "header" -> "true")).registerTempTable("crimes")
> ^
> <console>:13: error: not found: value %
> %spark
>
> Could it be something to do with my Zeppelin installation? The tutorial
> code ran without any issues, though.
>
> Thanks!
> Ryan
>
>
> On Mon, Sep 28, 2015 at 5:07 PM, Alexander Bezzubov <b...@apache.org> wrote:
>
>> Hi,
>>
>> thank you for your interest in Zeppelin!
>>
>> A couple of things I noticed: as you probably already know, the %dep and
>> %spark parts should always be in separate paragraphs.
>>
>> %spark already exposes the SQL context through the `sqlc` variable, so
>> you'd better use sqlc.load("...") instead.
>>
>> And of course, to be able to use the %spark interpreter in the notebook,
>> you need to make sure you have it bound (cog button, on the top right).
>>
>> Hope this helps!
>>
>> --
>> Kind regards,
>> Alex
>>
>>
>> On Mon, Sep 28, 2015 at 4:29 PM, Ryan <freelanceflashga...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> In a Zeppelin notebook, I am trying to load a CSV using the spark-csv
>>> package by Databricks. I am using the Hortonworks sandbox to run
>>> Zeppelin on. Unfortunately, the methods I have been trying have not
>>> been working.
>>>
>>> My latest attempt is:
>>>
>>> %dep
>>> z.load("com.databricks:spark-csv_2.10:1.2.0")
>>>
>>> %spark
>>> val crimeData = "hdfs://sandbox.hortonworks.com:8020/user/root/data/crime_incidents_2013_CSV.csv"
>>> sqlContext.load("hdfs://sandbox.hortonworks.com:8020/user/root/data/crime_incidents_2013_CSV.csv", Map("path" -> crimeData, "header" -> "true")).registerTempTable("crimes")
>>>
>>> This is the error I receive:
>>>
>>> <console>:16: error: not found: value sqlContext
>>> sqlContext.load("hdfs://sandbox.hortonworks.com:8020/user/root/data/crime_incidents_2013_CSV.csv", Map("path" -> crimeData, "header" -> "true")).registerTempTable("crimes")
>>> ^
>>> <console>:12: error: not found: value %
>>> %spark
>>> ^
>>>
>>> Thank you for any help in advance,
>>> Ryan
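
For anyone hitting the same errors later: both "not found: value sqlc" and "not found: value %" are what the Scala REPL reports when the %dep block and the %spark block are run in the *same* Zeppelin paragraph, so the interpreter sees `%spark` as literal Scala. A minimal sketch of the layout Alex describes, as three separate Zeppelin paragraphs (assuming Zeppelin 0.5.x with Spark 1.4+ and spark-csv 1.2.0 on the Hortonworks sandbox; depending on the Zeppelin build the injected variable may be `sqlc` or `sqlContext`, and the HDFS path is the one from the thread):

    // ---- Paragraph 1: load the dependency BEFORE the Spark interpreter starts ----
    %dep
    z.reset()
    z.load("com.databricks:spark-csv_2.10:1.2.0")

    // ---- Paragraph 2 (a separate paragraph): read the CSV, register a table ----
    %spark
    val crimeData = "hdfs://sandbox.hortonworks.com:8020/user/root/data/crime_incidents_2013_CSV.csv"
    val crimes = sqlContext.read
      .format("com.databricks.spark.csv")
      .option("header", "true")   // first line of the file is the header row
      .load(crimeData)
    crimes.registerTempTable("crimes")

    // ---- Paragraph 3: query the registered table ----
    %sql
    SELECT COUNT(*) FROM crimes

Note the %dep paragraph must be run before the Spark interpreter has started (hence z.reset(), or restart the interpreter from the settings page), otherwise the artifact is not on the interpreter's classpath.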