[jira] [Comment Edited] (SPARK-16578) Configurable hostname for RBackend
[ https://issues.apache.org/jira/browse/SPARK-16578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15431495#comment-15431495 ]

Felix Cheung edited comment on SPARK-16578 at 8/22/16 7:52 PM:
---
+1 on this. Some discussion and context/data points on the RBackend API and connecting to a remote JVM are in SPARK-16581.

was (Author: felixcheung):
+1 on this. Some discussion on the RBackend API and connecting to a remote JVM is in SPARK-16581.

> Configurable hostname for RBackend
> --
>
> Key: SPARK-16578
> URL: https://issues.apache.org/jira/browse/SPARK-16578
> Project: Spark
> Issue Type: Sub-task
> Components: SparkR
> Reporter: Shivaram Venkataraman
> Assignee: Junyang Qian
>
> One of the requirements that comes up with SparkR being a standalone package
> is that users can now install just the R package on the client side and
> connect to a remote machine which runs the RBackend class.
> We should check if we can support this mode of execution and what are the
> pros / cons of it.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
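To make the "configurable hostname" request concrete, here is an illustrative sketch in plain Python sockets. It is not SparkR's actual code, and `start_backend` is a hypothetical name; it only shows why the bind address matters: a backend bound to 127.0.0.1 accepts same-machine connections only, while a configurable hostname (e.g. 0.0.0.0) is what would let a client-side-only R package reach a remote RBackend.

```python
# Illustrative sketch only (plain Python sockets, NOT SparkR's actual code).
# start_backend is a hypothetical helper, not a Spark API.
import socket

def start_backend(host: str, port: int = 0) -> socket.socket:
    """Listen the way a backend would; port 0 picks a free ephemeral port."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((host, port))
    srv.listen(1)
    return srv

if __name__ == "__main__":
    # Loopback-only backend: reachable from this machine only.
    local_only = start_backend("127.0.0.1")
    print("loopback backend listening on", local_only.getsockname())
    local_only.close()

    # All-interfaces backend: remote clients on the network can connect,
    # which is the "remote RBackend" scenario this issue describes.
    any_iface = start_backend("0.0.0.0")
    print("all-interfaces backend listening on", any_iface.getsockname())
    any_iface.close()
```

Making the host argument configurable (rather than hard-coded to loopback) is the whole substance of the proposal; the security and deployment implications of listening on non-loopback interfaces are what the discussion below weighs.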
[jira] [Comment Edited] (SPARK-16578) Configurable hostname for RBackend
[ https://issues.apache.org/jira/browse/SPARK-16578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15429942#comment-15429942 ]

Xiangrui Meng edited comment on SPARK-16578 at 8/22/16 12:47 AM:
---
[~shivaram] I had an offline discussion with [~junyangq], and I think we may have some misunderstanding of the user scenarios. The old workflow for SparkR is the following:

1. Users download and install the Spark distribution themselves.
2. Users tell R where to find the SparkR package on the local machine.
3. `library(SparkR)`
4. Launch the driver/SparkContext (in client mode) and connect to a local or remote cluster.

And the ideal workflow is the following:

1. `install.packages("SparkR")` from CRAN, and then `library(SparkR)`
2. Optionally `install.spark`
3. Launch the driver/SparkContext (in client mode) and connect to a local or remote cluster.

So the way we run spark-submit, the RBackend, and the R process, and the way we create the SparkContext, doesn't really change: they still run on the same machine (e.g., the user's laptop), so it is not necessary to run the RBackend remotely for this scenario. Running the RBackend remotely is a new Spark deployment mode, and I think it requires more design and discussion.

was (Author: mengxr):
[~shivaram] I had an offline discussion with [~junyangq], and I think we may have some misunderstanding of the user scenarios. The old workflow for SparkR is the following:

1. Users download and install the Spark distribution themselves.
2. Users tell R where to find the SparkR package on the local machine.
3. `library(SparkR)`
4. Launch the driver/SparkContext (in client mode) and connect to a local or remote cluster.

And the ideal workflow is the following:

1. `install.packages("SparkR")` from CRAN
2. Optionally `install.spark`
3. Launch the driver/SparkContext (in client mode) and connect to a local or remote cluster.

So the way we run spark-submit, the RBackend, and the R process, and the way we create the SparkContext, doesn't really change: they still run on the same machine (e.g., the user's laptop), so it is not necessary to run the RBackend remotely for this scenario. Running the RBackend remotely is a new Spark deployment mode, and I think it requires more design and discussion.
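The same-machine assumption in the comment above can be sketched as follows. This is a hypothetical illustration in plain Python (the function names are mine, and SparkR's real handshake details may differ): when the backend and the R frontend share a machine, a dynamically chosen port can be handed off through the shared local filesystem, so no hostname configuration is needed. A remote RBackend breaks that assumption and needs an explicit host and port, which is why it amounts to a new deployment mode.

```python
# Hypothetical sketch (names are mine, NOT Spark source): a same-machine
# backend/frontend pair hands off an ephemeral port via a local file.
import os
import socket
import tempfile

def backend_start(port_file: str) -> socket.socket:
    """Bind to loopback on a free port and publish the port to a local file."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    with open(port_file, "w") as f:
        f.write(str(srv.getsockname()[1]))
    return srv

def frontend_connect(port_file: str) -> socket.socket:
    """Read the published port and connect; works only on the same machine."""
    with open(port_file) as f:
        port = int(f.read())
    cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    cli.connect(("127.0.0.1", port))
    return cli

if __name__ == "__main__":
    port_file = os.path.join(tempfile.mkdtemp(), "backend_port")
    srv = backend_start(port_file)
    cli = frontend_connect(port_file)
    conn, _ = srv.accept()
    print("frontend connected through port file")
    conn.close()
    cli.close()
    srv.close()
```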
[jira] [Comment Edited] (SPARK-16578) Configurable hostname for RBackend
[ https://issues.apache.org/jira/browse/SPARK-16578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15429942#comment-15429942 ]

Xiangrui Meng edited comment on SPARK-16578 at 8/22/16 12:46 AM:
---
[~shivaram] I had an offline discussion with [~junyangq], and I think we may have some misunderstanding of the user scenarios. The old workflow for SparkR is the following:

1. Users download and install the Spark distribution themselves.
2. Users tell R where to find the SparkR package on the local machine.
3. `library(SparkR)`
4. Launch the driver/SparkContext (in client mode) and connect to a local or remote cluster.

And the ideal workflow is the following:

1. `install.packages("SparkR")` from CRAN
2. Optionally `install.spark`
3. Launch the driver/SparkContext (in client mode) and connect to a local or remote cluster.

So the way we run spark-submit, the RBackend, and the R process, and the way we create the SparkContext, doesn't really change: they still run on the same machine (e.g., the user's laptop), so it is not necessary to run the RBackend remotely for this scenario. Running the RBackend remotely is a new Spark deployment mode, and I think it requires more design and discussion.

was (Author: mengxr):
[~shivaram] I had an offline discussion with [~junyangq], and I think we may have some misunderstanding of the user scenarios. The old workflow for SparkR is the following:

1. Users download and install the Spark distribution themselves.
2. Users tell R where to find the SparkR package on the local machine.
3. `library(SparkR)`
4. Launch the driver/SparkContext (in client mode) and connect to a local or remote cluster.

And the ideal workflow is the following:

1. `install.packages("SparkR")`
2. Optionally `install.spark`
3. Launch the driver/SparkContext (in client mode) and connect to a local or remote cluster.

So the way we run spark-submit, the RBackend, and the R process, and the way we create the SparkContext, doesn't really change: they still run on the same machine (e.g., the user's laptop), so it is not necessary to run the RBackend remotely for this scenario. Running the RBackend remotely is a new Spark deployment mode, and I think it requires more design and discussion.