A couple of clarifications:

1) I was able to use sqlContext.sql when "programmatically specifying the
schema" as described in this documentation:
https://spark.apache.org/docs/1.2.0/sql-programming-guide.html

2) Here is the notebook I ran. Using this, I was able to run sqlContext.sql
commands, but not the %sql commands:

import sys.process._
import org.apache.spark.sql._

// sc is an existing SparkContext.
val sqlContext = new org.apache.spark.sql.SQLContext(sc)

val wiki = sc.textFile("data/wiki.csv")

val schemaString = "date language title pagecounts"

val schema = StructType(schemaString.split(" ").map(fieldName =>
  StructField(fieldName, StringType, true)))

val rowRDD = wiki.map(_.split(" ")).map(line =>
  Row(line(0).substring(0, 8), line(1), line(2), line(3)))

val wikiSchemaRDD = sqlContext.applySchema(rowRDD, schema)

wikiSchemaRDD.registerTempTable("people")

val results = sqlContext.sql("SELECT * FROM people")

results.take(10)

So the query returns the correct results.
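As an aside for anyone reproducing this, the per-line mapping that builds rowRDD can be sketched as a standalone function. The sample line below is hypothetical; the real rows in wiki.csv may differ:

```scala
// Standalone sketch of the mapping fed to Row(...): split on spaces and
// truncate the first field to an 8-character yyyyMMdd date.
def parseWikiLine(raw: String): (String, String, String, String) = {
  val parts = raw.split(" ")
  (parts(0).substring(0, 8), parts(1), parts(2), parts(3))
}

// Hypothetical sample line, assumed format: timestamp language title count
println(parseWikiLine("20150617-000000 en Main_Page 42"))
```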

However, when I try:

%sql select date from people

java.lang.reflect.InvocationTargetException
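Given Nihal's point below that Zeppelin creates a sqlContext by default, my guess at a fix (a sketch, untested, assuming Zeppelin injects a shared sqlContext that the %sql interpreter also reads from) would be to drop the new SQLContext line and register the table against the injected context instead:

```scala
// Sketch, assuming Zeppelin provides a shared sqlContext to the notebook.
// Do NOT create a new SQLContext here -- a table registered in a private
// context would be invisible to the %sql interpreter.
// val sqlContext = new org.apache.spark.sql.SQLContext(sc)  // commented out

val wikiSchemaRDD = sqlContext.applySchema(rowRDD, schema)
wikiSchemaRDD.registerTempTable("people")
```

Then, in a separate paragraph, %sql select date from people should be able to resolve the table.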

Hope this adds clarity to my issues, thank you

Best,

Su

On Wed, Jun 17, 2015 at 12:47 AM, Su She <suhsheka...@gmail.com> wrote:

> Hello Nihal,
>
> This is what I got:
>
> sc.version: 1.2.1
>
> I couldn't get the names of the tables:
>
> I tried it both with this line in the code and with it commented out: val
> sqlContext = new org.apache.spark.sql.SQLContext(sc)
>
> error: value tableNames is not a member of org.apache.spark.sql.SQLContext
> sqlContext.tableNames().foreach(println)
>
> However, I don't think the table is registered with sqlContext. For
> example, if you check the Zeppelin tutorial, you cannot run:
>
> val results = sqlContext.sql("select * from bank") //error: table bank not
> found, however you can run %sql select * from bank
>
> When I followed this:
> https://spark.apache.org/docs/1.2.0/sql-programming-guide.html, I was
> able to use sqlContext.sql to query results, but I couldn't use %sql in
> that case :(
>
> Thanks again for the help and please let me know how I can proceed :)
>
> Thanks,
>
> Su
>
> On Wed, Jun 17, 2015 at 12:10 AM, Nihal Bhagchandani <
> nihal_bhagchand...@yahoo.com> wrote:
>
>> Hi Su,
>>
>> could you please check if your bank1 gets registered as a table?
>>
>> -Nihal
>>
>>
>>
>>   On Wednesday, 17 June 2015 11:54 AM, Su She <suhsheka...@gmail.com>
>> wrote:
>>
>>
>> Thanks Nihal for the suggestion, I think I realized what the problem is:
>> Zeppelin will use the HiveContext unless it is set to false. So I set it to
>> false in env.sh, and the 3 charts at the bottom of the tutorial work, as
>> SQLContext then becomes the default instead of HiveContext.
>>
>> However, I am having trouble running my own version of this notebook.
>>
>> As I was having problems with the notebook, I copied and pasted the code
>> from the tutorial, and instead of bank-full.csv I used wiki.csv. I followed
>> the same format as the tutorial and kept getting errors. I kept trying to
>> simplify the code, and this is where I ended up:
>>
>> PARA1:
>>
>> val wiki = bankText.map(s => s.split(" ")).map(
>>     s => Bank(s(3).toInt,
>>             "secondary",
>>             "third",
>>             "fourth",
>>             s(4).toInt
>>         )
>> )
>> wiki.registerTempTable("bank1")
>>
>> PARA2:
>>
>> wiki.take(10)
>>
>> Result:
>>
>> res213: Array[Bank] = Array(Bank(2,secondary,third,fourth,9980),
>> Bank(1,secondary,third,fourth,465), Bank(1,secondary,third,fourth,16086),
>>
>> Compare this to bank.take(10) from the tutorial:
>>
>> res188: Array[Bank] = Array(Bank(58,management,married,tertiary,2143),
>> Bank(44,technician,single,secondary,29),
>> Bank(33,entrepreneur,married,secondary,2),
>> Bank(47,blue-collar,married,unknown,1506), Bank(33,unknown,single,unknown,1)
>>
>> PARA3:
>>
>> %sql
>> select age, count(1) value
>> from bank1
>> where age < 33
>> group by age
>> order by age
>>
>> java.lang.reflect.InvocationTargetException
>>
>> I'm not sure what I'm doing wrong. The new array has the same data
>> format, but different values. It doesn't seem like there are any extra
>> spaces and such.
>>
>> On Tue, Jun 16, 2015 at 10:55 PM, Nihal Bhagchandani <
>> nihal_bhagchand...@yahoo.com> wrote:
>>
>> Hi Su,
>>
>> it seems like your table is not getting registered.
>>
>> can you try the following:
>> if you have used the line
>> "val sqlContext = new org.apache.spark.sql.SQLContext(sc)"
>>
>> I would suggest commenting it out, as Zeppelin creates sqlContext by default.
>>
>> If you didn't have the above line, add the following line at the end of the
>> paragraph and run it:
>>
>> sqlContext.tableNames().foreach(println) // this should print all the
>> tables registered with the current sqlContext in the output section.
>>
>> you can also check your Spark version by running the following command:
>> sc.version
>>
>> -Nihal
>>
>>
>>
>>
>>
>>   On Wednesday, 17 June 2015 10:01 AM, Su She <suhsheka...@gmail.com>
>> wrote:
>>
>>
>> Hello,
>>
>> excited to get Zeppelin up and running!
>>
>> 1) I was not able to go through the Zeppelin tutorial notebook. I did
>> remove toDF which made that paragraph work, but the 3 graphs at the
>> bottom all returned the InvocationTargetException
>>
>> 2) From a couple other threads on the archive it seems like this error
>> means that it isn't connected to Spark:
>>
>> a) I am running it locally
>>
>> b) I created a new notebook and I was able to run spark commands and
>> create a table using sqlContext and query it, so this means that it is
>> connected to Spark right?
>>
>> c) I am able to do:
>>
>> val results = sqlContext.sql("SELECT * FROM wiki")
>>
>> but I can't do:
>>
>> %sql select pagecounts, count(1) from wiki
>>
>> 3) I am a bit confused on how to get the visualizations. I understand
>> the %table command, but do I use %table when running Spark jobs or do
>> I use %sql?
>>
>> Thanks!
>>
>> -Su
>>
>>
>>
>>
>>
>>
>