For anyone monitoring the thread, I was able to successfully install and run a small Spark cluster and a model using the following method:
First, make sure that the username used to log in to RStudio Server is the same one that was used to install Spark on the EC2 instance. Thanks to Shivaram for his help here.

Log in to RStudio and point the library location at the folder where Spark is installed (in my case, ~/home/rstudio/spark):

# Load SparkR (the R package) from the installed directory
library("SparkR", lib.loc="./spark/R/lib")

The edits to the next line were important so that Spark knew where the install folder was located when initializing the cluster:

# Initialize the Spark local cluster in R, as 'sc'
sc <- sparkR.init("local[2]", "SparkR", "./spark")

From here, we ran a basic model using Spark from RStudio, which ran successfully.
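The post does not include the model code itself, so here is a rough sketch of what a basic SparkR 1.4 session might look like after the initialization above. The sqlContext, the use of the built-in faithful dataset, and the column names are assumptions for illustration, not taken from the original steps:

# Create a SQL context on top of the Spark context from the step above
sqlContext <- sparkRSQL.init(sc)

# Turn a local R data.frame (the built-in faithful dataset) into a distributed DataFrame
df <- createDataFrame(sqlContext, faithful)

# A simple distributed computation: count observations per waiting time
waitingCounts <- summarize(groupBy(df, df$waiting), count = n(df$waiting))
head(collect(waitingCounts))

# Shut the local cluster down when finished
sparkR.stop()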