[ https://issues.apache.org/jira/browse/SPARK-15799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15326146#comment-15326146 ]
Sun Rui commented on SPARK-15799:
---------------------------------
This has been requested before. The difficulty is that SparkR has to work
with a Spark distribution of the matching version. I think releasing SparkR
on CRAN will promote its adoption, so we need to find a release model for
it. My thoughts are as follows:
1. Release the R portion of SparkR as the SparkR package on CRAN, following the
normal R package conventions. The package contains the matching Spark version
and a link to the Spark distribution. The package has an .onLoad() function.
When the package is loaded, .onLoad() checks whether a local Spark distribution
is installed. If not, it attempts to download the distribution from the link
and save it to a proper location. The SparkR CRAN package depends on the Spark
distribution for the RBackend, for local mode execution, and for remote cluster
connection. .onLoad() sets SPARK_HOME if it finds the Spark distribution.
2. Add a version check mechanism, so that SparkR can check that it matches the
remote cluster when remote cluster deploy mode is desired.
3. R users don't need special scripts like bin/sparkR or bin/spark-submit to
use SparkR. They can just start R and load SparkR with library(), or run a
SparkR script from the command line. In SparkR.init(), a version check is
performed, and if the versions do not match, an error message is displayed.
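
To make points 1 and 2 concrete, here is a rough sketch in R of what the load
hook and the version check might look like. It is only an illustration under
assumptions of mine, not an actual implementation: the names spark_version,
find_spark_home() and check_spark_version(), the "2.0.0" placeholder, the
~/.sparkr cache directory, and the exact-string version comparison are all
hypothetical. Only .onLoad() and SPARK_HOME come from the proposal. The
download step is left as a comment.

```r
# Hypothetical: the Spark version this package release is built against.
spark_version <- "2.0.0"

# Return the path of a local Spark distribution, or NA if none is found.
# Looks at SPARK_HOME first, then at an assumed per-user cache directory.
find_spark_home <- function(cache_dir) {
  home <- Sys.getenv("SPARK_HOME", unset = NA)
  if (!is.na(home) && nzchar(home) && dir.exists(home)) {
    return(home)
  }
  candidate <- file.path(cache_dir, paste0("spark-", spark_version))
  if (dir.exists(candidate)) candidate else NA_character_
}

# Raise an error when the package version and the cluster version differ.
# Exact string equality is an assumption; a real check might be looser.
check_spark_version <- function(package_version, cluster_version) {
  if (!identical(package_version, cluster_version)) {
    stop(sprintf("SparkR %s does not match Spark %s on the cluster",
                 package_version, cluster_version), call. = FALSE)
  }
  invisible(TRUE)
}

# Load hook: set SPARK_HOME when a local distribution is found.
.onLoad <- function(libname, pkgname) {
  home <- find_spark_home(file.path(path.expand("~"), ".sparkr"))
  if (!is.na(home)) {
    Sys.setenv(SPARK_HOME = home)
  }
  # Otherwise the proposal would download the distribution from the
  # recorded link and unpack it under the cache directory (omitted here).
  invisible(NULL)
}
```

With something like this, SparkR.init() could call check_spark_version() with
the package's version and the version reported by the cluster, and stop with
the error message when they differ.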
> Release SparkR on CRAN
> ----------------------
>
> Key: SPARK-15799
> URL: https://issues.apache.org/jira/browse/SPARK-15799
> Project: Spark
> Issue Type: New Feature
> Components: SparkR
> Reporter: Xiangrui Meng
>
> Story: "As an R user, I would like to see SparkR released on CRAN, so I can
> use SparkR easily in an existing R environment and have other packages built
> on top of SparkR."
> I made this JIRA with the following questions in mind:
> * Are there known issues that prevent us releasing SparkR on CRAN?
> * Do we want to package Spark jars in the SparkR release?
> * Are there license issues?
> * How does it fit into Spark's release process?