Thanks.

 

I tried to access Hive table via JDBC (it works) through sqlContext

 

 

scala> val sqlContext = new org.apache.spark.sql.SQLContext(sc)

sqlContext: org.apache.spark.sql.SQLContext = 
org.apache.spark.sql.SQLContext@4f60415b

 

scala> val s = sqlContext.load("jdbc",

     | Map("url" -> "jdbc:hive2://rhes564:10010/oraclehadoop",

     | "dbtable" -> "SALES",

     | "user" -> "hduser",

     | "password" -> "xxx"))

warning: there were 1 deprecation warning(s); re-run with -deprecation for 
details

java.sql.SQLException: Method not supported

 

In general one should expect this to work

 

The attraction of Spark is to cache these tables in memory via registering them 
as temporary tables and do the queries there.

 

 

Dr Mich Talebzadeh

 

LinkedIn  
https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

 

http://talebzadehmich.wordpress.com <http://talebzadehmich.wordpress.com/> 

 

NOTE: The information in this email is proprietary and confidential. This 
message is for the designated recipient only, if you are not the intended 
recipient, you should destroy it immediately. Any information in this message 
shall not be understood as given or endorsed by Peridale Technology Ltd, its 
subsidiaries or their employees, unless expressly so stated. It is the 
responsibility of the recipient to ensure that this email is virus free, 
therefore neither Peridale Technology Ltd, its subsidiaries nor their employees 
accept any responsibility.

 

 

From: ayan guha [mailto:guha.a...@gmail.com] 
Sent: 14 February 2016 21:07
To: Mich Talebzadeh <m...@peridale.co.uk>
Cc: user <user@spark.apache.org>
Subject: Re: Trying to join a registered Hive table as temporary with two 
Oracle tables registered as temporary in Spark

 

Why can't you use the jdbc in hive context? I don't think sharing data across 
contexts are allowed. 

On 15 Feb 2016 07:22, "Mich Talebzadeh" <m...@peridale.co.uk 
<mailto:m...@peridale.co.uk> > wrote:

I am intending to get a table from Hive and register it as temporary table in 
Spark.

 

I have created contexts for both Hive and Spark as below

 

val sqlContext = new org.apache.spark.sql.SQLContext(sc)

val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)

//

 

I get the Hive table as below using HiveContext

 

//Get the FACT table from Hive

//

var s = hiveContext.sql("SELECT AMOUNT_SOLD, TIME_ID, CHANNEL_ID FROM 
oraclehadoop.sales")

 

s.registerTempTable("t_s")

 

This works fine using HiveContext

 

scala> hiveContext.sql("select count(1) from t_s").collect.foreach(println)

[4991761]

 

Now I use JDBC to get data from two Oracle tables and registar them as 
temporary tables using sqlContext

 

val c = sqlContext.load("jdbc",

Map("url" -> "jdbc:oracle:thin:@rhes564:1521:mydb",

"dbtable" -> "(SELECT to_char(CHANNEL_ID) AS CHANNEL_ID, CHANNEL_DESC FROM 
sh.channels)",

"user" -> "sh",

"password" -> "xxx"))

 

val t = sqlContext.load("jdbc",

Map("url" -> "jdbc:oracle:thin:@rhes564:1521:mydb",

"dbtable" -> "(SELECT to_char(TIME_ID) AS TIME_ID, CALENDAR_MONTH_DESC FROM 
sh.times)",

"user" -> "sh",

"password" -> "sxxx"))

 

And register them as temporary tables

 

c.registerTempTable("t_c")

t.registerTempTable("t_t")

//

 

Now trying to do SQL on three tables using sqlContext. However it cannot see 
the hive table 

 

var sqltext : String = ""

sqltext = """

SELECT rs.Month, rs.SalesChannel, round(TotalSales,2)

FROM

(

SELECT t_t.CALENDAR_MONTH_DESC AS Month, t_c.CHANNEL_DESC AS SalesChannel, 
SUM(t_s.AMOUNT_SOLD) AS TotalSales

FROM t_s, t_t, t_c

WHERE t_s.TIME_ID = t_t.TIME_ID

AND   t_s.CHANNEL_ID = t_c.CHANNEL_ID

GROUP BY t_t.CALENDAR_MONTH_DESC, t_c.CHANNEL_DESC

ORDER by t_t.CALENDAR_MONTH_DESC, t_c.CHANNEL_DESC

) rs

LIMIT 10

 

 

sqlContext.sql(sqltext).collect.foreach(println)

 

org.apache.spark.sql.AnalysisException: no such table t_s; line 5 pos 10

 

I guess this is due to two  different Data Frame used. Is there any solution? 
For example can I transorm from HiveContext to sqlContext?

 

Thanks

 

Dr Mich Talebzadeh

 

LinkedIn  
https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

 

http://talebzadehmich.wordpress.com <http://talebzadehmich.wordpress.com/> 

 

NOTE: The information in this email is proprietary and confidential. This 
message is for the designated recipient only, if you are not the intended 
recipient, you should destroy it immediately. Any information in this message 
shall not be understood as given or endorsed by Peridale Technology Ltd, its 
subsidiaries or their employees, unless expressly so stated. It is the 
responsibility of the recipient to ensure that this email is virus free, 
therefore neither Peridale Technology Ltd, its subsidiaries nor their employees 
accept any responsibility.

 

 

Reply via email to