Chip Senkbeil created SPARK-13573:
-------------------------------------
Summary: Open SparkR APIs (R package) to allow better 3rd party usage
Key: SPARK-13573
URL: https://issues.apache.org/jira/browse/SPARK-13573
Project: Spark
Issue Type: Improvement
Components: SparkR
Reporter: Chip Senkbeil
Currently, SparkR's R package does not expose enough of its APIs to be used
flexibly. As far as I am aware, SparkR still requires you to create a new
SparkContext by invoking the sparkR.init method (so you cannot connect to a
running one), and there is no way to invoke custom Java methods using the
exposed SparkR API (unlike PySpark).
We currently maintain a fork of SparkR that is used to power the R
implementation of Apache Toree, which is a gateway for using Apache Spark. This
fork provides a connect method (to use an existing SparkContext), exposes
needed methods like invokeJava (to be able to communicate with our JVM to
retrieve code to run, etc.), and uses reflection to access
org.apache.spark.api.r.RBackend.
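To illustrate the reflection point, here is a minimal Java sketch of the
general technique. It is not the code from the fork. HiddenBackend and its
init method are hypothetical stand-ins for a class that calling code cannot
reference directly, and the helper names newInstance and call are assumptions
made up for this example. Nothing here reflects the real RBackend signatures.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;

public class ReflectiveBackendAccess {

    // Hypothetical stand-in for a class that is not directly accessible.
    private static class HiddenBackend {
        private HiddenBackend() {}

        // Pretend this starts a backend and returns the port it listens on.
        private int init() {
            return 12345;
        }
    }

    // Load a class by name and instantiate it through its no-arg
    // constructor, even if that constructor is not public.
    public static Object newInstance(String className) throws Exception {
        Class<?> cls = Class.forName(className);
        Constructor<?> ctor = cls.getDeclaredConstructor();
        ctor.setAccessible(true);
        return ctor.newInstance();
    }

    // Invoke a no-arg method by name, even if it is not public.
    public static Object call(Object target, String methodName) throws Exception {
        Method method = target.getClass().getDeclaredMethod(methodName);
        method.setAccessible(true);
        return method.invoke(target);
    }

    public static void main(String[] args) throws Exception {
        Object backend = newInstance(HiddenBackend.class.getName());
        System.out.println(call(backend, "init")); // prints 12345
    }
}
```

The cost of this approach is that a rename or signature change in the target
class only shows up at runtime, which is part of why a supported public API
would be preferable.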
Here is the documentation I recorded regarding changes we need to enable SparkR
as an option for Apache Toree:
https://github.com/apache/incubator-toree/tree/master/sparkr-interpreter/src/main/resources