Chip Senkbeil created SPARK-13573:
-------------------------------------

             Summary: Open SparkR APIs (R package) to allow better 3rd party usage
                 Key: SPARK-13573
                 URL: https://issues.apache.org/jira/browse/SPARK-13573
             Project: Spark
          Issue Type: Improvement
          Components: SparkR
            Reporter: Chip Senkbeil


Currently, SparkR's R package does not expose enough of its APIs to be used 
flexibly. As far as I am aware, SparkR still requires you to create a new 
SparkContext by invoking the sparkR.init method (so you cannot connect to a 
running one), and there is no way to invoke custom Java methods using the 
exposed SparkR API (unlike PySpark).

We currently maintain a fork of SparkR that powers the R implementation of 
Apache Toree, a gateway for using Apache Spark. This fork provides a connect 
method (to use an existing SparkContext), exposes needed methods like 
invokeJava (so that we can communicate with our JVM to retrieve code to run, 
etc.), and uses reflection to access org.apache.spark.api.r.RBackend.
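
To illustrate the reflection step, here is a minimal JVM-side sketch. It is not the code from our fork: the class name ReflectiveLookup and the helper findClass are hypothetical names chosen for this example, and nothing is assumed about RBackend's constructors or methods. It only shows how a class can be looked up by name at runtime, without a compile-time dependency, and inspected when it is present.

```java
import java.lang.reflect.Method;
import java.util.Optional;

public class ReflectiveLookup {
    // Hypothetical helper: load a class by its fully qualified name at runtime.
    // Returns an empty Optional if the class is missing or fails to link.
    static Optional<Class<?>> findClass(String name) {
        try {
            return Optional.of(Class.forName(name));
        } catch (ClassNotFoundException | LinkageError e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // The class named in this issue; it is only found when Spark is on the classpath.
        String name = "org.apache.spark.api.r.RBackend";
        Optional<Class<?>> backend = findClass(name);
        if (backend.isPresent()) {
            // List the declared methods without assuming any particular signature.
            for (Method m : backend.get().getDeclaredMethods()) {
                System.out.println(m.getName());
            }
        } else {
            System.out.println(name + " is not on the classpath");
        }
    }
}
```

Without Spark on the classpath the program simply reports that the class is missing. With Spark present, non-public members would additionally need setAccessible(true) before they could be invoked, which is the kind of workaround that a public API would make unnecessary.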

Here is the documentation I recorded regarding changes we need to enable SparkR 
as an option for Apache Toree: 
https://github.com/apache/incubator-toree/tree/master/sparkr-interpreter/src/main/resources



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
