Why can't you use JDBC in the Hive context? I don't think sharing data across contexts is allowed.
On 15 Feb 2016 07:22, "Mich Talebzadeh" <m...@peridale.co.uk> wrote:
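To expand on that: temp tables are scoped to the SQLContext that registered them, and HiveContext extends SQLContext, so it also supports the "jdbc" data source. A minimal sketch of the suggestion (Spark 1.x, untested here — it assumes a Spark shell with `sc` in scope, the Oracle JDBC driver on the classpath, and the same connection details as in your mail):

```scala
// Assumption: run inside spark-shell (Spark 1.x), so `sc` already exists
// and the Oracle thin driver jar is on the driver/executor classpath.
import org.apache.spark.sql.hive.HiveContext

val hiveContext = new HiveContext(sc)

// Hive fact table, registered through the HiveContext.
hiveContext.sql(
  "SELECT AMOUNT_SOLD, TIME_ID, CHANNEL_ID FROM oraclehadoop.sales")
  .registerTempTable("t_s")

// Oracle dimension table, loaded through the SAME context instead of a
// separate SQLContext, so its temp table lands in the same catalog.
val c = hiveContext.load("jdbc", Map(
  "url"      -> "jdbc:oracle:thin:@rhes564:1521:mydb",
  "dbtable"  -> "(SELECT to_char(CHANNEL_ID) AS CHANNEL_ID, CHANNEL_DESC FROM sh.channels)",
  "user"     -> "sh",
  "password" -> "xxx"))
c.registerTempTable("t_c")

// t_s and t_c (and t_t, loaded the same way) are now visible to one
// hiveContext.sql(...) call, so the three-way join should resolve.
```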
> I am intending to get a table from Hive and register it as a temporary
> table in Spark.
>
> I have created contexts for both Hive and Spark as below:
>
> val sqlContext = new org.apache.spark.sql.SQLContext(sc)
> val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
>
> I get the Hive table as below using HiveContext:
>
> //Get the FACT table from Hive
> var s = hiveContext.sql("SELECT AMOUNT_SOLD, TIME_ID, CHANNEL_ID FROM
> oraclehadoop.sales")
>
> s.registerTempTable("t_s")
>
> This works fine using HiveContext:
>
> scala> hiveContext.sql("select count(1) from t_s").collect.foreach(println)
> [4991761]
>
> Now I use JDBC to get data from two Oracle tables and register them as
> temporary tables using sqlContext:
>
> val c = sqlContext.load("jdbc",
>   Map("url" -> "jdbc:oracle:thin:@rhes564:1521:mydb",
>       "dbtable" -> "(SELECT to_char(CHANNEL_ID) AS CHANNEL_ID, CHANNEL_DESC FROM sh.channels)",
>       "user" -> "sh",
>       "password" -> "xxx"))
>
> val t = sqlContext.load("jdbc",
>   Map("url" -> "jdbc:oracle:thin:@rhes564:1521:mydb",
>       "dbtable" -> "(SELECT to_char(TIME_ID) AS TIME_ID, CALENDAR_MONTH_DESC FROM sh.times)",
>       "user" -> "sh",
>       "password" -> "sxxx"))
>
> And register them as temporary tables:
>
> c.registerTempTable("t_c")
> t.registerTempTable("t_t")
>
> Now trying to do SQL on three tables using sqlContext.
> However, it cannot see the Hive table:
>
> var sqltext : String = ""
> sqltext = """
> SELECT rs.Month, rs.SalesChannel, round(TotalSales,2)
> FROM
> (
>   SELECT t_t.CALENDAR_MONTH_DESC AS Month, t_c.CHANNEL_DESC AS SalesChannel,
>          SUM(t_s.AMOUNT_SOLD) AS TotalSales
>   FROM t_s, t_t, t_c
>   WHERE t_s.TIME_ID = t_t.TIME_ID
>   AND t_s.CHANNEL_ID = t_c.CHANNEL_ID
>   GROUP BY t_t.CALENDAR_MONTH_DESC, t_c.CHANNEL_DESC
>   ORDER BY t_t.CALENDAR_MONTH_DESC, t_c.CHANNEL_DESC
> ) rs
> LIMIT 10
> """
>
> sqlContext.sql(sqltext).collect.foreach(println)
>
> org.apache.spark.sql.AnalysisException: no such table t_s; line 5 pos 10
>
> I guess this is because two different contexts are used. Is there any
> solution? For example, can I transform from HiveContext to sqlContext?
>
> Thanks
>
> Dr Mich Talebzadeh
>
> LinkedIn:
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
>
> NOTE: The information in this email is proprietary and confidential. This
> message is for the designated recipient only; if you are not the intended
> recipient, you should destroy it immediately. Any information in this
> message shall not be understood as given or endorsed by Peridale Technology
> Ltd, its subsidiaries or their employees, unless expressly so stated. It is
> the responsibility of the recipient to ensure that this email is virus
> free; therefore neither Peridale Technology Ltd, its subsidiaries nor their
> employees accept any responsibility.