I call stop from the console because RStudio warns and advises it. And yes, after stop was called, the whole script was run again in full. That means the init line "hivecontext <- sparkRHive.init(sc)" is always called after stop.
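For reference, a minimal sketch of the restart sequence being described, assuming the same SPARK_HOME and data path used elsewhere in this thread:

Sys.setenv(SPARK_HOME="/usr/hdp/current/spark-client")
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
library(SparkR)

sparkR.stop()                       # invalidates any existing sc and hivecontext
sc <- sparkR.init()                 # a fresh context must be created after stop
hivecontext <- sparkRHive.init(sc)  # and the hive context re-created against it
df <- loadDF(hivecontext, "/data/ingest/sparktest1/", "orc")
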
On Tue, Jan 12, 2016 at 8:31 PM, Felix Cheung <felixcheun...@hotmail.com> wrote:

> As you can see from my reply below from Jan 6, calling sparkR.stop()
> invalidates both the sc and the hivecontext you have and results in this
> invalid jobj error.
>
> If you start R and run this, it should work:
>
> Sys.setenv(SPARK_HOME="/usr/hdp/current/spark-client")
> .libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
> library(SparkR)
>
> sc <- sparkR.init()
> hivecontext <- sparkRHive.init(sc)
> df <- loadDF(hivecontext, "/data/ingest/sparktest1/", "orc")
>
> Is there a reason you want to call stop? If you do, you would need to call
> the line hivecontext <- sparkRHive.init(sc) again.
>
> _____________________________
> From: Sandeep Khurana <sand...@infoworks.io>
> Sent: Tuesday, January 12, 2016 5:20 AM
> Subject: Re: sparkR ORC support.
> To: Felix Cheung <felixcheun...@hotmail.com>
> Cc: spark users <user@spark.apache.org>, Prem Sure <premsure...@gmail.com>,
> Deepak Sharma <deepakmc...@gmail.com>, Yanbo Liang <yblia...@gmail.com>
>
> It worked for some time. Then I did sparkR.stop() and re-ran, only to get
> the same error. Any idea why it ran fine before? (While it was running
> fine, it kept warning that it was reusing the existing spark-context and
> that I should restart.) There is one more R script which instantiates
> Spark; I ran that again too.
>
> On Tue, Jan 12, 2016 at 3:05 PM, Sandeep Khurana <sand...@infoworks.io>
> wrote:
>
>> The complete stacktrace is below. Could it be something with the Java
>> versions?
>>
>> stop("invalid jobj ", value$id)
>> 8
>> writeJobj(con, object)
>> 7
>> writeObject(con, a)
>> 6
>> writeArgs(rc, args)
>> 5
>> invokeJava(isStatic = TRUE, className, methodName, ...)
>> 4
>> callJStatic("org.apache.spark.sql.api.r.SQLUtils", "loadDF", sqlContext,
>> source, options)
>> 3
>> read.df(sqlContext, path, source, schema, ...)
>> 2
>> loadDF(hivecontext, filepath, "orc")
>>
>> On Tue, Jan 12, 2016 at 2:41 PM, Sandeep Khurana <sand...@infoworks.io>
>> wrote:
>>
>>> Running this gave
>>>
>>> 16/01/12 04:06:54 INFO BlockManagerMaster: Registered BlockManager
>>> Error in writeJobj(con, object) : invalid jobj 3
>>>
>>> How does it know which hive schema to connect to?
>>>
>>> On Tue, Jan 12, 2016 at 2:34 PM, Felix Cheung <felixcheun...@hotmail.com>
>>> wrote:
>>>
>>>> It looks like you have overwritten sc. Could you try this:
>>>>
>>>> Sys.setenv(SPARK_HOME="/usr/hdp/current/spark-client")
>>>> .libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"),
>>>> .libPaths()))
>>>> library(SparkR)
>>>>
>>>> sc <- sparkR.init()
>>>> hivecontext <- sparkRHive.init(sc)
>>>> df <- loadDF(hivecontext, "/data/ingest/sparktest1/", "orc")
>>>>
>>>> ------------------------------
>>>> Date: Tue, 12 Jan 2016 14:28:58 +0530
>>>> Subject: Re: sparkR ORC support.
>>>> From: sand...@infoworks.io
>>>> To: felixcheun...@hotmail.com
>>>> CC: yblia...@gmail.com; user@spark.apache.org; premsure...@gmail.com;
>>>> deepakmc...@gmail.com
>>>>
>>>> The code is very simple, pasted below.
>>>> hive-site.xml is in the spark conf directory already.
>>>> I still see this error
>>>>
>>>> Error in writeJobj(con, object) : invalid jobj 3
>>>>
>>>> after running the script below.
>>>>
>>>> script
>>>> =======
>>>> Sys.setenv(SPARK_HOME="/usr/hdp/current/spark-client")
>>>>
>>>> .libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"),
>>>> .libPaths()))
>>>> library(SparkR)
>>>>
>>>> sc <<- sparkR.init()
>>>> sc <<- sparkRHive.init()
>>>> hivecontext <<- sparkRHive.init(sc)
>>>> df <- loadDF(hivecontext, "/data/ingest/sparktest1/", "orc")
>>>> #View(df)
>>>>
>>>> On Wed, Jan 6, 2016 at 11:08 PM, Felix Cheung <felixcheun...@hotmail.com>
>>>> wrote:
>>>>
>>>> Yes, as Yanbo suggested, it looks like there is something wrong with
>>>> the sqlContext.
>>>>
>>>> Could you forward us your code please?
>>>>
>>>> On Wed, Jan 6, 2016 at 5:52 AM -0800, "Yanbo Liang" <yblia...@gmail.com>
>>>> wrote:
>>>>
>>>> You should ensure your sqlContext is a HiveContext.
>>>>
>>>> sc <- sparkR.init()
>>>> sqlContext <- sparkRHive.init(sc)
>>>>
>>>> 2016-01-06 20:35 GMT+08:00 Sandeep Khurana <sand...@infoworks.io>:
>>>>
>>>> Felix
>>>>
>>>> I tried the option you suggested. It gave the error below. I am going
>>>> to try the option suggested by Prem.
>>>>
>>>> Error in writeJobj(con, object) : invalid jobj 1
>>>> 8
>>>> stop("invalid jobj ", value$id)
>>>> 7
>>>> writeJobj(con, object)
>>>> 6
>>>> writeObject(con, a)
>>>> 5
>>>> writeArgs(rc, args)
>>>> 4
>>>> invokeJava(isStatic = TRUE, className, methodName, ...)
>>>> 3
>>>> callJStatic("org.apache.spark.sql.api.r.SQLUtils", "loadDF",
>>>> sqlContext, source, options)
>>>> 2
>>>> read.df(sqlContext, filepath, "orc") at spark_api.R#108
>>>>
>>>> On Wed, Jan 6, 2016 at 10:30 AM, Felix Cheung <felixcheun...@hotmail.com>
>>>> wrote:
>>>>
>>>> Firstly, I don't have ORC data to verify, but this should work:
>>>>
>>>> df <- loadDF(sqlContext, "data/path", "orc")
>>>>
>>>> Secondly, could you check if sparkR.stop() was called?
>>>> sparkRHive.init() should be called after sparkR.init() - please check
>>>> if there is any error message there.
>>>>
>>>> _____________________________
>>>> From: Prem Sure <premsure...@gmail.com>
>>>> Sent: Tuesday, January 5, 2016 8:12 AM
>>>> Subject: Re: sparkR ORC support.
>>>> To: Sandeep Khurana <sand...@infoworks.io>
>>>> Cc: spark users <user@spark.apache.org>, Deepak Sharma
>>>> <deepakmc...@gmail.com>
>>>>
>>>> Yes Sandeep, also copy hive-site.xml to the spark conf directory.
>>>>
>>>> On Tue, Jan 5, 2016 at 10:07 AM, Sandeep Khurana <sand...@infoworks.io>
>>>> wrote:
>>>>
>>>> Also, do I need to set up hive in spark as per the link
>>>> http://stackoverflow.com/questions/26360725/accesing-hive-tables-in-spark ?
>>>>
>>>> Do we need to copy the hdfs-site.xml file to the spark conf directory?
>>>>
>>>> On Tue, Jan 5, 2016 at 8:28 PM, Sandeep Khurana <sand...@infoworks.io>
>>>> wrote:
>>>>
>>>> Deepak
>>>>
>>>> Tried this. Getting this error now:
>>>>
>>>> Error in sql(hivecontext, "FROM CATEGORIES SELECT category_id", "") :
>>>> unused argument ("")
>>>>
>>>> On Tue, Jan 5, 2016 at 6:48 PM, Deepak Sharma <deepakmc...@gmail.com>
>>>> wrote:
>>>>
>>>> Hi Sandeep
>>>> Can you try this?
>>>>
>>>> results <- sql(hivecontext, "FROM test SELECT id", "")
>>>>
>>>> Thanks
>>>> Deepak
>>>>
>>>> On Tue, Jan 5, 2016 at 5:49 PM, Sandeep Khurana <sand...@infoworks.io>
>>>> wrote:
>>>>
>>>> Thanks Deepak.
>>>>
>>>> I tried this as well. I created a hivecontext with "hivecontext <<-
>>>> sparkRHive.init(sc)".
>>>>
>>>> When I tried to read a hive table from it with
>>>>
>>>> results <- sql(hivecontext, "FROM test SELECT id")
>>>>
>>>> I get the error below:
>>>>
>>>> Error in callJMethod(sqlContext, "sql", sqlQuery) : Invalid jobj 2. If
>>>> SparkR was restarted, Spark operations need to be re-executed.
>>>>
>>>> Not sure what is causing this. Any leads or ideas? I am using RStudio.
>>>>
>>>> On Tue, Jan 5, 2016 at 5:35 PM, Deepak Sharma <deepakmc...@gmail.com>
>>>> wrote:
>>>>
>>>> Hi Sandeep
>>>> I am not sure if ORC can be read directly in R.
>>>> But there can be a workaround: first create a hive table on top of the
>>>> ORC files, and then access that hive table in R.
>>>>
>>>> Thanks
>>>> Deepak
>>>>
>>>> On Tue, Jan 5, 2016 at 4:57 PM, Sandeep Khurana <sand...@infoworks.io>
>>>> wrote:
>>>>
>>>> Hello
>>>>
>>>> I need to read ORC files in hdfs in R using spark. I am not able to
>>>> find a package to do that.
>>>>
>>>> Can anyone help with documentation or an example for this purpose?
>>>>
>>>> --
>>>> Architect
>>>> Infoworks.io <http://infoworks.io>
>>>> http://Infoworks.io
>>>>
>>>> --
>>>> Thanks
>>>> Deepak
>>>> www.bigdatabig.com
>>>> www.keosha.net
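
For reference, a minimal sketch of the workaround Deepak suggests above: expose the ORC files as an external hive table, then query that table through the HiveContext from SparkR. The table name test_orc and the single id INT column are hypothetical; the real schema must match the ORC files, and the DDL runs against whichever metastore the hive-site.xml in the spark conf directory points to:

Sys.setenv(SPARK_HOME="/usr/hdp/current/spark-client")
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
library(SparkR)

sc <- sparkR.init()
hivecontext <- sparkRHive.init(sc)

# One-time DDL: create an external table over the ORC files (hypothetical
# name and schema; adjust to the actual ORC layout).
sql(hivecontext, "CREATE EXTERNAL TABLE IF NOT EXISTS test_orc (id INT)
                  STORED AS ORC LOCATION '/data/ingest/sparktest1/'")

# The data can then be queried like any other hive table.
results <- sql(hivecontext, "SELECT id FROM test_orc")
head(results)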