All right, so this is the actual code you executed. What error do you get? And have you run the SQL directly in Spark 1.6? I suspect it is more likely a Spark issue.
On Wed, Aug 24, 2016 at 7:45 PM, Sundararajan, Pranav <pranav.sundarara...@pfizer.com> wrote:

> No Jeff, that is not the issue here.
>
> For instance, in my previous Zeppelin version, I was calling %sql and then writing a CREATE TABLE statement below it, reading data from the Spark tables I had created in that particular session.
>
> It reads something like this:
>
> %sql
> CREATE TABLE SDS_DIAG_RESTRICTED_3YRS AS
> SELECT a.PTID, a.SDS_TERM, c.DIAGNOSIS_CD, c.DIAGNOSIS_CD_TYPE
> FROM
>   (SELECT PTID, SDS_TERM
>    FROM
>      (SELECT PTID, SDS_TERM, count(distinct NOTE_DATE) AS counted
>       FROM sds201320142015
>       WHERE lower(SDS_TERM) rlike (${2. Terms of Interest='nash|pain'})
>         AND substring(NOTE_DATE,7,10) rlike (${4. Years of Interest='2014'})
>       GROUP BY PTID, SDS_TERM) AS B
>    WHERE counted >= ${3. Min Nr of Notes=2}) AS a
> INNER JOIN
>   (SELECT PTID, DIAGNOSIS_CD, DIAGNOSIS_CD_TYPE
>    FROM diag201320142015
>    WHERE lower(DIAGNOSIS_CD) rlike (${1. Diagnosis codes='8820'})) AS c
> ON a.PTID = c.PTID
> GROUP BY a.PTID, a.SDS_TERM, c.DIAGNOSIS_CD, c.DIAGNOSIS_CD_TYPE
>
> Here sds201320142015 (the portion highlighted in yellow in the original mail) is the temporary Spark table from which I am trying to create my Hive table SDS_DIAG_RESTRICTED_3YRS. So the main issue is that this code snippet, which worked on Zeppelin 0.5.6, does not work on Zeppelin 0.6.0, despite carrying the same interpreter settings I have described before.
>
> I hope you now have a better understanding of the problem we are trying to resolve.
> Regards,
>
> Pranav Sundararajan | On assignment to PfizerWorks
> Cell: +91 9008412366
> Email: pfizerwor...@pfizer.com; pranav.sundarara...@pfizer.com
> Website: http://pfizerWorks.pfizer.com
>
> From: Jeff Zhang [mailto:zjf...@gmail.com]
> Sent: Wednesday, August 24, 2016 4:45 AM
> To: Sundararajan, Pranav
> Cc: dev@zeppelin.apache.org; d...@zeppelin.incubator.apache.org; us...@zeppelin.apache.org; Jaisankar, Saurabh; Patel, Dhaval; Nagasravanthi, Valluri
> Subject: Re: Issues in Zeppelin 0.6.0
>
> You are using sqlContext to run SQL, so it should not be related to the hive interpreter; it should run under the spark interpreter. I was able to run your SQL successfully with Zeppelin 0.6 and Spark 1.6. Could you restart your Zeppelin and try again? If possible, please attach the interpreter log.
>
> On Wed, Aug 24, 2016 at 4:19 PM, Sundararajan, Pranav <pranav.sundarara...@pfizer.com> wrote:
>
> Hi Jeff,
>
> Thanks for your suggestion, but I believe we worded our syntax exactly that way, with the TABLE keyword present; we merely missed capturing it in the mail.
>
> Coming back to the problem, we are fairly certain this issue flared up due to the Zeppelin upgrade we performed: in the previous version we were using (0.5.6), the very same syntax for CREATE and DROP statements worked, and only after the upgrade does it return the error.
>
> We have checked the hive interpreter settings multiple times, tried reconfiguring them, and are able to call the interpreter successfully.
> The issue arises only when we try to establish a connection between the Spark temp tables created in that SparkContext instance and previously existing Hive tables. Any pointers, or better yet remedies, for this particular problem would be extremely helpful.
>
> Thanks a lot for your help!
>
> Regards,
> Pranav Sundararajan
>
> From: Jeff Zhang [mailto:zjf...@gmail.com]
> Sent: Wednesday, August 24, 2016 4:01 AM
> To: Nagasravanthi, Valluri
> Cc: dev@zeppelin.apache.org; d...@zeppelin.incubator.apache.org; us...@zeppelin.apache.org; Jaisankar, Saurabh; Sundararajan, Pranav; Patel, Dhaval
> Subject: Re: Issues in Zeppelin 0.6.0
>
> I think it is a SQL syntax issue: you are missing the keyword TABLE. It should be
>
> create table tablename as select * from tablename_1
> drop table if exists tablename
>
> On Mon, Aug 22, 2016 at 8:10 PM, Nagasravanthi, Valluri <valluri.nagasravan...@pfizer.com> wrote:
>
> Hi Jeff,
>
> We are still trying to sort out the issue we highlighted before, so it would be great if you could share your insight as soon as possible so that we can tackle it in a suitable manner.
> Thanks and Regards,
>
> Valluri Naga Sravanthi | On assignment to PfizerWorks
> Cell: +91 9008412366
> Email: pfizerwor...@pfizer.com; valluri.nagasravan...@pfizer.com
> Website: http://pfizerWorks.pfizer.com
>
> From: Nagasravanthi, Valluri
> Sent: Friday, August 19, 2016 8:39 AM
> To: 'Jeff Zhang'
> Cc: dev@zeppelin.apache.org; d...@zeppelin.incubator.apache.org; us...@zeppelin.apache.org; Jaisankar, Saurabh; Sundararajan, Pranav
> Subject: RE: Issues in Zeppelin 0.6.0
>
> We upgraded the Ambari cluster while we were working with Zeppelin 0.5.6. At that point we were using Spark 1.5.2. After the Ambari upgrade, the Spark version was upgraded to 1.6. After that, we upgraded Zeppelin to 0.6.0.
>
> From: Jeff Zhang [mailto:zjf...@gmail.com]
> Sent: Friday, August 19, 2016 7:48 AM
> To: Nagasravanthi, Valluri
> Cc: dev@zeppelin.apache.org; d...@zeppelin.incubator.apache.org; us...@zeppelin.apache.org; Jaisankar, Saurabh; Sundararajan, Pranav
> Subject: Re: Issues in Zeppelin 0.6.0
>
> Did you change the Spark version when upgrading Zeppelin?
>
> On Fri, Aug 19, 2016 at 7:36 PM, Nagasravanthi, Valluri <valluri.nagasravan...@pfizer.com> wrote:
>
> Hi,
>
> Actually, when we worked on Zeppelin 0.5.6, we used sqlContext itself for reading the Hive files, registering Spark dataframes as temp tables, and eventually running queries on the temp tables.
> The issue I am facing on Zeppelin 0.6.0 right now is that, even though I am able to execute SELECT queries on the temp tables, I am not able to execute DDL statements like CREATE/DROP TABLE using temp tables derived from the Hive files.
>
> To elaborate further, the following code works fine:
>
> - sqlContext.sql("select * from table_name")
> - sqlContext.sql("select count(column_name) from table_name")
>
> But the following fails to execute:
>
> - sqlContext.sql("create tablename as select * from tablename_1")
> - sqlContext.sql("drop if exists tablename")
>
> I am getting the error: java.lang.RuntimeException: [1.1] failure: ``with'' expected but identifier drop found
>
> I even tried the sql interpreter:
>
> %sql
> drop if exists tablename
>
> I am getting the same error again: java.lang.RuntimeException: [1.1] failure: ``with'' expected but identifier drop found
>
> Thanks a lot for being so responsive. I hope to fix this issue soon with your help.
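For reference, a minimal sketch of those two failing statements with the missing TABLE keyword restored (tablename and tablename_1 are the placeholder names used above):

```scala
// Sketch only: the DDL from above with the TABLE keyword added.
// Assumes the sqlContext instance provided by Zeppelin's spark interpreter,
// which must be backed by a HiveContext for Hive DDL to succeed.
sqlContext.sql("create table tablename as select * from tablename_1")
sqlContext.sql("drop table if exists tablename")
```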
> Thanks and Regards,
> Valluri Naga Sravanthi
>
> From: Jeff Zhang [mailto:zjf...@gmail.com]
> Sent: Friday, August 19, 2016 6:49 AM
> To: Sundararajan, Pranav
> Cc: dev@zeppelin.apache.org; d...@zeppelin.incubator.apache.org; us...@zeppelin.apache.org; Jaisankar, Saurabh; Nagasravanthi, Valluri
> Subject: Re: Issues in Zeppelin 0.6.0
>
> Sorry, you should use sqlContext.
>
> On Fri, Aug 19, 2016 at 6:11 PM, Sundararajan, Pranav <pranav.sundarara...@pfizer.com> wrote:
>
> Hi,
>
> We tried "hiveContext" instead of "HiveContext" but it did not work out. Please find below the code and the error log:
>
> import org.apache.spark.sql.hive.HiveContext
> val df_test_hive = hiveContext.read.parquet("/location/hivefile")
>
> import org.apache.spark.sql.hive.HiveContext
> <console>:67: error: not found: value hiveContext
> val df_test_hive = hiveContext.read.parquet("/location/hivefile")
>
> It seems no hiveContext value is defined in this session. Could you please suggest any other way out?
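As a hedged sketch of one possible way out in Spark 1.6: HiveContext is a class rather than a ready-made value, so if the interpreter does not expose an instance, one can be constructed from the existing SparkContext (sc in Zeppelin):

```scala
// Sketch for Spark 1.6: construct a HiveContext from the existing SparkContext.
// In Zeppelin, the spark interpreter normally provides such an instance already
// (exposed as sqlContext when Spark is built with Hive support), so this is
// only a fallback.
import org.apache.spark.sql.hive.HiveContext

val hiveContext = new HiveContext(sc)
val df_test_hive = hiveContext.read.parquet("/location/hivefile")
```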
> Regards,
> Pranav Sundararajan
>
> From: Jeff Zhang [mailto:zjf...@gmail.com]
> Sent: Friday, August 19, 2016 3:35 AM
> To: Sundararajan, Pranav
> Cc: dev@zeppelin.apache.org; d...@zeppelin.incubator.apache.org; us...@zeppelin.apache.org; Jaisankar, Saurabh
> Subject: Re: Issues in Zeppelin 0.6.0
>
> It should be "hiveContext" instead of "HiveContext".
>
> On Fri, Aug 19, 2016 at 3:33 PM, Sundararajan, Pranav <pranav.sundarara...@pfizer.com> wrote:
>
> Hi,
>
> Please find below the code:
>
> import org.apache.spark.sql.hive.HiveContext
> val df_test_hive = HiveContext.read.parquet("/location/hivefile")
>
> We are getting the following error in the log:
>
> import org.apache.spark.sql.hive.HiveContext
> <console>:57: error: object HiveContext in package hive cannot be accessed in package org.apache.spark.sql.hive
> val df_test_hive = HiveContext.read.parquet("/location/hivefile")
>
> PFB the embedded image of the hive interpreter settings.
>
> Regards,
> Pranav Sundararajan
>
> From: Jeff Zhang [mailto:zjf...@gmail.com]
> Sent: Thursday, August 18, 2016 8:31 PM
> To: dev@zeppelin.apache.org
> Cc:
d...@zeppelin.incubator.apache.org; us...@zeppelin.apache.org; Sundararajan, Pranav; Jaisankar, Saurabh
> Subject: Re: Issues in Zeppelin 0.6.0
>
> Hi,
>
> Since you have many issues, let's focus on one issue first:
>
> >>> not able to use the HiveContext to read the Hive table
>
> Can you paste the code showing how you use HiveContext? Do you create it yourself? It should be created by Zeppelin, so you don't need to create it. What is in the interpreter log?
>
> On Thu, Aug 18, 2016 at 7:35 PM, Nagasravanthi, Valluri <valluri.nagasravan...@pfizer.com> wrote:
>
> Hi,
>
> I am using Zeppelin 0.6.0. Please find below the issues along with a detailed explanation.
>
> Zeppelin 0.6.0 issues:
>
> a. Not able to execute DDL statements like CREATE/DROP TABLE using temp tables derived from the Hive table.
>    Error log: "java.lang.RuntimeException: [1.1] failure: ``with'' expected but identifier drop found" (when using the sql interpreter to drop)
>
> b. Not able to use the HiveContext to read the Hive table.
>    Error log: "error: object HiveContext in package hive cannot be accessed in package org.apache.spark.sql.hive"
>
> Detailed explanation:
>
> I upgraded from Zeppelin 0.5.6 to 0.6.0 last week and am facing some issues while using notebooks on 0.6.0. I am using Ambari 2.4.2 as my cluster manager, and the Spark version is 1.6.
>
> The workflow of the notebook is as follows:
>
> 1. Create a Spark Scala dataframe by reading a Hive table in parquet/text format using sqlContext: sqlContext.read.parquet("/tablelocation/tablename")
> 2. Import the sqlContext implicits
> 3. Register the dataframe as a temp table
> 4.
Write queries using the %sql interpreter or sqlContext.sql.
>
> The issue I am facing right now is that, even though I am able to execute SELECT queries on the temp tables, I am not able to execute DDL statements like CREATE/DROP TABLE using temp tables derived from the Hive table.
>
> The following is my code:
>
> 1st case: sqlContext.sql("drop if exists tablename")
> 2nd case: %sql
>           drop if exists tablename
>
> I am getting the same error in both cases:
> java.lang.RuntimeException: [1.1] failure: ``with'' expected but identifier drop found (when using the sql interpreter to drop)
>
> It should be noted that the same code used to work in Zeppelin 0.5.6.
>
> After researching a bit, I found that I need to use HiveContext to query a Hive table.
>
> The second issue I am facing is that I was able to import HiveContext using "import org.apache.spark.sql.hive.HiveContext", but I was not able to use the HiveContext to read the Hive table.
>
> This is the code I wrote:
>
> HiveContext.read.parquet("/tablelocation/tablename")
>
> I got the following error:
>
> error: object HiveContext in package hive cannot be accessed in package org.apache.spark.sql.hive
>
> I am not able to dig deeper into this error, as there is not much support online.
>
> Could anyone please suggest a fix for these errors?
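The four workflow steps above can be sketched as follows (a minimal outline for Spark 1.6; the path and table names are the placeholders from the description):

```scala
// Sketch of the notebook workflow described above (Spark 1.6, Zeppelin's
// built-in sqlContext). Path and table names are placeholders.
val df = sqlContext.read.parquet("/tablelocation/tablename") // 1. read the Hive data
import sqlContext.implicits._                                // 2. import implicits
df.registerTempTable("temptable")                            // 3. register temp table
sqlContext.sql("select count(*) from temptable").show()      // 4. query via sqlContext.sql
```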
> Thanks and Regards,
> Valluri Naga Sravanthi
>
> --
> Best Regards
> Jeff Zhang

--
Best Regards
Jeff Zhang