[ 
https://issues.apache.org/jira/browse/SPARK-15799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15326146#comment-15326146
 ] 

Sun Rui commented on SPARK-15799:
---------------------------------

This has been requested before. The difficulty is that SparkR has to work 
together with a Spark distribution of the matching version. I think releasing 
SparkR on CRAN will promote its adoption, so we need to find a release model 
for it. My thoughts are as follows:
1. Release the R portion of SparkR as the SparkR package on CRAN, following 
the normal R package conventions. The package records the matching Spark 
version and a link to the Spark distribution. The package has an .onLoad() 
function. When the package is loaded, .onLoad() checks whether a Spark 
distribution is installed locally. If not, it attempts to download the 
distribution from the link and save it to a suitable location. The SparkR CRAN 
package depends on the Spark distribution for the RBackend, for local mode 
execution, and for connecting to a remote cluster. .onLoad() sets SPARK_HOME 
if it finds the Spark distribution.
2. Add a version check mechanism, so that SparkR can verify that it matches 
the remote cluster when remote cluster deploy mode is desired.
3. R users don't need special scripts like bin/sparkR or bin/spark-submit to 
use SparkR. They can just start R and load SparkR with library(), or run a 
SparkR script from the command line. SparkR.init() performs the version check 
and displays an error message if the versions do not match.
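
To make the flow in points 1 to 3 concrete, here is a rough sketch of the 
decision logic. It is written in Python only as a language-neutral 
illustration; the real hooks would be R code in .onLoad() and SparkR.init(). 
Everything named here is an assumption and not part of SparkR: the version 
constant, the placeholder download URL, the function names, the "contains 
bin/spark-submit" heuristic for recognizing a distribution, and the choice to 
treat versions as matching when major.minor agree.

```python
from pathlib import Path
from typing import Mapping, Optional, Sequence, Tuple

# Hypothetical values: the proposal only says the package carries "the
# matching Spark version and link"; these concrete values are placeholders.
PACKAGE_SPARK_VERSION = "2.0.0"
DOWNLOAD_URL_TEMPLATE = "https://example.invalid/spark-{version}.tgz"


def looks_like_spark(path: Path) -> bool:
    """Assumed heuristic: a Spark distribution ships bin/spark-submit."""
    return (path / "bin" / "spark-submit").is_file()


def find_spark_home(env: Mapping[str, str],
                    candidates: Sequence[str] = ()) -> Optional[Path]:
    """Return the first location holding a Spark distribution, else None.

    SPARK_HOME from the environment is tried first, then the candidates.
    """
    paths = [env["SPARK_HOME"]] if env.get("SPARK_HOME") else []
    paths.extend(candidates)
    for p in paths:
        path = Path(p)
        if looks_like_spark(path):
            return path
    return None


def plan_on_load(env: Mapping[str, str],
                 candidates: Sequence[str] = ()) -> Tuple[str, str]:
    """Decide what the load hook would do (point 1).

    Returns ("use", <path>) when a local distribution exists, otherwise
    ("download", <url>) for the version the package was built against.
    """
    home = find_spark_home(env, candidates)
    if home is not None:
        return ("use", str(home))
    url = DOWNLOAD_URL_TEMPLATE.format(version=PACKAGE_SPARK_VERSION)
    return ("download", url)


def versions_match(package_version: str, cluster_version: str) -> bool:
    """Assumed rule: versions match when major.minor are equal."""
    return package_version.split(".")[:2] == cluster_version.split(".")[:2]


def check_cluster_version(cluster_version: str) -> None:
    """Version check at init time (points 2 and 3): fail on a mismatch."""
    if not versions_match(PACKAGE_SPARK_VERSION, cluster_version):
        raise RuntimeError(
            f"SparkR package targets Spark {PACKAGE_SPARK_VERSION}, "
            f"but the cluster runs Spark {cluster_version}"
        )
```

For example, plan_on_load({}) with no local installation yields a "download" 
action, and check_cluster_version("1.6.1") raises an error because 1.6 does 
not match 2.0. A stricter rule (exact version equality) would be a one-line 
change in versions_match().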


> Release SparkR on CRAN
> ----------------------
>
>                 Key: SPARK-15799
>                 URL: https://issues.apache.org/jira/browse/SPARK-15799
>             Project: Spark
>          Issue Type: New Feature
>          Components: SparkR
>            Reporter: Xiangrui Meng
>
> Story: "As an R user, I would like to see SparkR released on CRAN, so I can 
> use SparkR easily in an existing R environment and have other packages built 
> on top of SparkR."
> I made this JIRA with the following questions in mind:
> * Are there known issues that prevent us releasing SparkR on CRAN?
> * Do we want to package Spark jars in the SparkR release?
> * Are there license issues?
> * How does it fit into Spark's release process?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)