[ 
https://issues.apache.org/jira/browse/SPARK-8596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604053#comment-14604053
 ] 

Vincent Warmerdam commented on SPARK-8596:
------------------------------------------

I'm writing a small tutorial on getting up to speed with RStudio on AWS. It 
works. The main issue is that the EC2 scripts currently install an old version 
of R (3.1), while most packages, such as ggplot2, require a newer version 
(3.2). I'm going to share the tutorial with the RStudio folks soon. 

My approach is to run `spark/bin/start-all.sh` on the master node and then run 
the following commands in RStudio on the master node: 

# Make the SparkR package (shipped with Spark, not on CRAN) visible to R
.libPaths(c(.libPaths(), '/root/spark/R/lib'))
Sys.setenv(SPARK_HOME = '/root/spark')
Sys.setenv(PATH = paste(Sys.getenv('PATH'), '/root/spark/bin', sep = ':'))
library(SparkR)

# Connect to the cluster; fill in the master URL
sc <- sparkR.init('<SPARK MASTER ADR>')
sqlContext <- sparkRSQL.init(sc)

This works on my end, and I've been able to use the DataFrame API with a JSON 
blob on S3 through this sqlContext. 
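For anyone following along, a minimal sketch of that last step using the Spark 1.4 SparkR API (the bucket path and the `age` column are hypothetical placeholders, not from an actual dataset, and this assumes the cluster has S3 credentials configured):

```r
# Read a JSON file from S3 into a SparkR DataFrame via the sqlContext
# created above. The path and column name below are made-up examples.
df <- jsonFile(sqlContext, "s3n://my-bucket/data.json")

printSchema(df)                 # inspect the schema Spark inferred
head(filter(df, df$age > 21))   # DataFrame API: filter rows, peek at results
```

Any S3 scheme the underlying Hadoop installation supports (s3n here) should work the same way.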

> Install and configure RStudio server on Spark EC2
> -------------------------------------------------
>
>                 Key: SPARK-8596
>                 URL: https://issues.apache.org/jira/browse/SPARK-8596
>             Project: Spark
>          Issue Type: Improvement
>          Components: EC2, SparkR
>            Reporter: Shivaram Venkataraman
>
> This will make it convenient for R users to use SparkR from their browsers 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
