Thanks, Nihal, for the help. I think I can upgrade to Spark 1.4 next week;
hopefully that works!

On Wed, Jun 17, 2015 at 9:11 PM, Nihal Bhagchandani <
nihal_bhagchand...@yahoo.com> wrote:

> Hi Su,
>
> I have already switched to Spark 1.4.0. Spark 1.3.0 introduced the concept
> of the DataFrame, which gives more flexibility for managing data in
> different formats.
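>
> For example, something along these lines (just a rough sketch; the file
> path, the Page case class, and the column names are my own inventions, and
> the exact API may differ slightly between 1.3 and 1.4):
>
> import sqlContext.implicits._
>
> // Turn an RDD of case classes into a DataFrame, then use the DataFrame API
> // instead of hand-rolled map/reduce over raw lines.
> case class Page(date: String, language: String, title: String, count: Int)
> val df = sc.textFile("data/wiki.csv")
>   .map(_.split(" "))
>   .map(p => Page(p(0), p(1), p(2), p(3).toInt))
>   .toDF()
> df.filter(df("language") === "en").groupBy("title").count().show()
>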
> Is there any possibility that you could move your Zeppelin to Spark 1.4.0?
> You can build Zeppelin by running the following command:
>
> $ sudo mvn clean package -Pspark-1.4 -Dhadoop.version=2.2.0 -Phadoop-2.2
> -DskipTests
>
> For more details: apache/incubator-zeppelin
> <https://github.com/apache/incubator-zeppelin>
>
> Regards
> Nihal
>
>   On Wednesday, 17 June 2015 1:41 PM, Su She <suhsheka...@gmail.com>
> wrote:
>
>
> A couple of clarifications:
>
> 1) I was able to use sqlContext.sql when "programmatically specifying the
> schema" as described in this documentation:
> https://spark.apache.org/docs/1.2.0/sql-programming-guide.html
>
> 2) Here is the notebook I ran using this approach. I was able to run
> sqlContext.sql commands, but not %sql commands:
>
> import sys.process._
> import org.apache.spark.sql._
>
> // sc is an existing SparkContext.
> val sqlContext = new org.apache.spark.sql.SQLContext(sc)
>
> val wiki = sc.textFile("data/wiki.csv")
>
> // The schema is encoded in a string of space-separated field names.
> val schemaString = "date language title pagecounts"
>
> // Build a schema of all-string columns from the field names.
> val schema = StructType(schemaString.split(" ").map(fieldName =>
>   StructField(fieldName, StringType, true)))
>
> // Convert each line of the CSV into a Row.
> val rowRDD = wiki.map(_.split(" ")).map(line =>
>   Row(line(0).substring(0, 8), line(1), line(2), line(3)))
>
> // Apply the schema and register the result as a temp table.
> val wikiSchemaRDD = sqlContext.applySchema(rowRDD, schema)
> wikiSchemaRDD.registerTempTable("people")
>
> val results = sqlContext.sql("SELECT * FROM people")
> results.take(10)
>
> So results returns the correct rows.
>
> However, when I try:
>
> %sql select date from people
>
> it throws:
>
> java.lang.reflect.InvocationTargetException
>
> Hope this adds clarity to my issues. Thank you!
>
> Best,
>
> Su
>
> On Wed, Jun 17, 2015 at 12:47 AM, Su She <suhsheka...@gmail.com> wrote:
>
> Hello Nihal,
>
> This is what I got:
>
> sc.version: 1.2.1
>
> I couldn't get the names of the tables. I tried it both with the following
> line in the code and with it commented out:
>
> val sqlContext = new org.apache.spark.sql.SQLContext(sc)
>
> Either way I get:
>
> error: value tableNames is not a member of org.apache.spark.sql.SQLContext
> sqlContext.tableNames().foreach(println)
>
> (I'm guessing tableNames() simply doesn't exist yet in Spark 1.2.)
>
> However, I don't think the table is registered with sqlContext. For
> example, if you check the Zeppelin tutorial, you cannot run:
>
> val results = sqlContext.sql("select * from bank") // error: table bank not found
>
> even though you can run: %sql select * from bank
>
> When I followed this:
> https://spark.apache.org/docs/1.2.0/sql-programming-guide.html, I was
> able to use sqlContext.sql to query results, but I couldn't use %sql in
> that case :(
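>
> (Sketching my guess at the mismatch below. This assumes Zeppelin injects a
> shared sqlContext into the Spark interpreter, which is what your suggestion
> about not creating my own SQLContext implies; the file path and the schema
> value are reused from the notebook I built following the guide.)
>
> // Reuse the sqlContext Zeppelin provides instead of creating a new one,
> // so the temp table lands in the same catalog that %sql queries.
> // val sqlContext = new org.apache.spark.sql.SQLContext(sc)  // <-- comment out
> val rowRDD = sc.textFile("data/wiki.csv").map(_.split(" ")).map(l => Row(l(0), l(1), l(2), l(3)))
> sqlContext.applySchema(rowRDD, schema).registerTempTable("wiki")
> // then, in a separate paragraph: %sql select * from wiki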
>
> Thanks again for the help, and please let me know how I can proceed :)
>
> Thanks,
>
> Su
>
> On Wed, Jun 17, 2015 at 12:10 AM, Nihal Bhagchandani <
> nihal_bhagchand...@yahoo.com> wrote:
>
> Hi Su,
>
> Could you please check if your bank1 gets registered as a table?
>
> -Nihal
>
>
>
>   On Wednesday, 17 June 2015 11:54 AM, Su She <suhsheka...@gmail.com>
> wrote:
>
>
> Thanks Nihal for the suggestion; I think I've figured out what the problem
> is. Zeppelin will use the HiveContext unless that option is set to false.
> So I set it to false in zeppelin-env.sh, and the 3 charts at the bottom of
> the tutorial now work, since SQLContext becomes the default instead of
> HiveContext.
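>
> (For reference, the line I set in conf/zeppelin-env.sh was, if I'm
> remembering the variable name right, something like:)
>
> export ZEPPELIN_SPARK_USEHIVECONTEXT=false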
>
> However, I am having trouble running my own version of this notebook.
>
> As I was having problems with the notebook, I copy/pasted the code from the
> tutorial and used wiki.csv instead of bank-full.csv. I followed the same
> format as the tutorial and kept getting errors. I kept trying to simplify
> the code, and this is where I ended up:
>
> PARA1:
>
> // Bank is the case class from the tutorial paragraph I copied.
> val wiki = bankText.map(s => s.split(" ")).map(
>     s => Bank(s(3).toInt,
>             "secondary",
>             "third",
>             "fourth",
>             s(4).toInt
>         )
> )
> wiki.registerTempTable("bank1")
>
> PARA2:
>
> wiki.take(10)
>
> Result:
>
> res213: Array[Bank] = Array(Bank(2,secondary,third,fourth,9980),
> Bank(1,secondary,third,fourth,465), Bank(1,secondary,third,fourth,16086),
>
> Compare this to bank.take(10) from the tutorial:
>
> res188: Array[Bank] = Array(Bank(58,management,married,tertiary,2143),
> Bank(44,technician,single,secondary,29),
> Bank(33,entrepreneur,married,secondary,2),
> Bank(47,blue-collar,married,unknown,1506), Bank(33,unknown,single,unknown,1)
>
> PARA3:
>
> %sql
> select age, count(1) value
> from bank1
> where age < 33
> group by age
> order by age
>
> Result:
>
> java.lang.reflect.InvocationTargetException
>
> I'm not sure what I'm doing wrong. The new array has the same data format,
> but different values. It doesn't seem like there are any extra spaces and
> such.
>
> On Tue, Jun 16, 2015 at 10:55 PM, Nihal Bhagchandani <
> nihal_bhagchand...@yahoo.com> wrote:
>
> Hi Su,
>
> It seems like your table is not getting registered.
>
> Can you try the following? If you have used the line
>
> "val sqlContext = new org.apache.spark.sql.SQLContext(sc)"
>
> I would suggest commenting it out, as Zeppelin creates a sqlContext by
> default.
>
> If you didn't have the above line, add the following line at the end of the
> paragraph and run it:
>
> sqlContext.tableNames().foreach(println) // this should print all the tables
> registered with the current sqlContext in the output section
>
> You can also check your Spark version by running the following command:
>
> sc.version
>
> -Nihal
>
>   On Wednesday, 17 June 2015 10:01 AM, Su She <suhsheka...@gmail.com>
> wrote:
>
>
> Hello,
>
> Excited to get Zeppelin up and running!
>
> 1) I was not able to get through the Zeppelin tutorial notebook. I did
> remove toDF, which made that paragraph work, but the 3 graphs at the
> bottom all returned the InvocationTargetException.
>
> 2) From a couple of other threads on the archive, it seems like this error
> means that Zeppelin isn't connected to Spark:
>
> a) I am running it locally
>
> b) I created a new notebook, and I was able to run Spark commands and
> create a table using sqlContext and query it, so this means that it is
> connected to Spark, right?
>
> c) I am able to do:
>
> val results = sqlContext.sql("SELECT * FROM wiki")
>
> but I can't do:
>
> %sql select pagecounts, count(1) from wiki
>
> 3) I am a bit confused about how to get the visualizations. I understand
> the %table command, but do I use %table when running Spark jobs, or do I
> use %sql?
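>
> For example, is the idea to print %table-formatted output from a Spark
> paragraph, something like this? (Just my guess from skimming the docs; the
> column names and values here are made up.)
>
> %spark
> // guess: print "%table" followed by tab-separated values and Zeppelin
> // renders the paragraph output as a table/chart
> println("%table name\tvalue\nfoo\t1\nbar\t2")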
>
> Thanks!
>
> -Su
>