Sun Rui created SPARK-16055:
-------------------------------
Summary: sparkR.init() can not load sparkPackages when executing
an R file
Key: SPARK-16055
URL: https://issues.apache.org/jira/browse/SPARK-16055
Project: Spark
Issue Type: Brainstorming
Components: SparkR
Affects Versions: 1.6.1
Reporter: Sun Rui
Priority: Minor
This is an issue reported in the Spark user mailing list. Refer to
http://comments.gmane.org/gmane.comp.lang.scala.spark.user/35742
The issue does not occur in an interactive SparkR session; it occurs only
when an R file is executed.
The following example code can be put into an R file to reproduce this issue:
{code}
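# Make the SparkR package bundled with the Spark distribution loadable.
.libPaths(c("/home/user/spark-1.6.1-bin-hadoop2.6/R/lib", .libPaths()))
Sys.setenv(SPARK_HOME = "/home/user/spark-1.6.1-bin-hadoop2.6")
library("SparkR")

# sparkPackages is the argument that is not honored when this file is executed.
sc <- sparkR.init(sparkPackages = "com.databricks:spark-csv_2.11:1.4.0")
sqlContext <- sparkRSQL.init(sc)

# Reading the CSV file depends on the spark-csv package requested above.
df <- read.df(sqlContext,
              "file:///home/user/spark-1.6.1-bin-hadoop2.6/data/mllib/sample_tree_data.csv",
              "csv")
showDF(df)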
{code}
The cause is that when an R file is executed, the R backend is launched
before the R interpreter, so the packages specified with ‘sparkPackages’
never get a chance to be processed.
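
As a rough illustration of the launch-order problem described above, the sketch below checks from inside a script whether a backend was already started before the R interpreter. It assumes that the launcher exports the environment variable EXISTING_SPARKR_BACKEND_PORT to the R process, which is how the SparkR 1.6 sources appear to signal this; the helper name backendAlreadyRunning is hypothetical, and the suggestion to pass the coordinates through spark-submit --packages is an untested assumption, not a fix confirmed in this issue.

```r
# Sketch only: detect whether the backend was started before this R script.
# Assumption: the launcher sets EXISTING_SPARKR_BACKEND_PORT for the R process.
# backendAlreadyRunning() is a hypothetical helper, not part of SparkR.
backendAlreadyRunning <- function(port = Sys.getenv("EXISTING_SPARKR_BACKEND_PORT", "")) {
  # A non-empty value means sparkR.init() would connect to an existing backend
  # instead of launching a new one with the requested packages.
  nzchar(port)
}

packages <- "com.databricks:spark-csv_2.11:1.4.0"

if (backendAlreadyRunning()) {
  warning("A backend is already running; sparkPackages = '", packages,
          "' would not be processed. Consider passing the coordinates to the ",
          "launcher instead, e.g. spark-submit --packages ", packages,
          " <script>.R")
} else {
  message("No running backend detected; sparkR.init(sparkPackages = ...) ",
          "can launch one with the requested packages.")
}
```

Such a check could sit at the top of the reproduction file so that the script reports the situation instead of failing later in read.df.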
This JIRA issue tracks the problem. An appropriate solution is still to be
discussed; one option is to document the limitation.