[jira] [Comment Edited] (SPARK-16578) Configurable hostname for RBackend

2016-08-22 Thread Felix Cheung (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15431495#comment-15431495
 ] 

Felix Cheung edited comment on SPARK-16578 at 8/22/16 7:52 PM:
---

+1 on this.
Some discussion and context/data points on the RBackend API and on connecting to 
a remote JVM are in SPARK-16581.



was (Author: felixcheung):
+1 on this.
Some discussion on the RBackend API and on connecting to a remote JVM is in 
SPARK-16581.


> Configurable hostname for RBackend
> --
>
> Key: SPARK-16578
> URL: https://issues.apache.org/jira/browse/SPARK-16578
> Project: Spark
>  Issue Type: Sub-task
>  Components: SparkR
>Reporter: Shivaram Venkataraman
>Assignee: Junyang Qian
>
> One of the requirements that comes up with SparkR being a standalone package 
> is that users can now install just the R package on the client side and 
> connect to a remote machine that runs the RBackend class.
> We should check whether we can support this mode of execution and what its 
> pros / cons are.






[jira] [Comment Edited] (SPARK-16578) Configurable hostname for RBackend

2016-08-21 Thread Xiangrui Meng (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15429942#comment-15429942
 ] 

Xiangrui Meng edited comment on SPARK-16578 at 8/22/16 12:47 AM:
-

[~shivaram] I had an offline discussion with [~junyangq], and I feel that we 
might have a misunderstanding of the user scenarios.

The old workflow for SparkR is the following (a minimal sketch in R follows the 
list):

1. Users download and install the Spark distribution by themselves.
2. Users let R know where to find the SparkR package on the local machine.
3. `library(SparkR)`
4. Launch the driver/SparkContext (in client mode) and connect to a local or 
remote cluster.
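
A rough sketch of that old workflow in R; the Spark install path and the master 
URL below are placeholders, not fixed values:

    # Spark distribution downloaded and unpacked manually; the path is hypothetical.
    Sys.setenv(SPARK_HOME = "/opt/spark-2.0.0")

    # Point R at the SparkR package bundled inside that distribution.
    library(SparkR, lib.loc = file.path(Sys.getenv("SPARK_HOME"), "R", "lib"))

    # Launch the driver/SparkContext in client mode; RBackend and the R process
    # both run on this machine, and only the cluster master may be remote.
    sparkR.session(master = "spark://master-host:7077")  # or "local[*]"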

And the ideal workflow is the following (again, a sketch follows the list):

1. `install.packages("SparkR")` from CRAN and then `library(SparkR)`
2. optionally `install.spark`
3. Launch the driver/SparkContext (in client mode) and connect to a local or 
remote cluster.
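
Assuming a SparkR package published on CRAN, that ideal workflow would look 
roughly like this (the master URL is again a placeholder):

    # Install the standalone SparkR package from CRAN and load it.
    install.packages("SparkR")
    library(SparkR)

    # Optionally download a matching Spark distribution if none is installed.
    install.spark()

    # Launch the driver/SparkContext in client mode, exactly as before.
    sparkR.session(master = "spark://master-host:7077")  # or "local[*]"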

So the way we run spark-submit, RBackend, and the R process, and the way we 
create the SparkContext, doesn't really change. They all still run on the same 
machine (e.g., the user's laptop), so it is not necessary to run RBackend 
remotely for this scenario.

Running RBackend remotely is a new Spark deployment mode, and I think it 
requires more design and discussion.
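
To make concrete what that mode would involve (an illustration in base R, not an 
existing SparkR API), the R process would have to open its backend socket to a 
remote host instead of localhost; the hostname, port, and timeout below are 
purely hypothetical:

    # Hypothetical sketch: connect to an RBackend assumed to be listening on a
    # remote machine. Today SparkR assumes the backend runs on localhost.
    backendHost <- "gateway.example.com"   # placeholder hostname
    backendPort <- 12345L                  # placeholder port

    conn <- socketConnection(host = backendHost, port = backendPort,
                             blocking = TRUE, open = "wb", timeout = 6000)
    # ... SparkR would then serialize its JVM method calls over this connection ...
    close(conn)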


was (Author: mengxr):
[~shivaram] I had an offline discussion with [~junyangq], and I feel that we 
might have a misunderstanding of the user scenarios.

The old workflow for SparkR is the following:

1. Users download and install the Spark distribution by themselves.
2. Users let R know where to find the SparkR package on the local machine.
3. `library(SparkR)`
4. Launch the driver/SparkContext (in client mode) and connect to a local or 
remote cluster.

And the ideal workflow is the following:

1. `install.packages("SparkR")` from CRAN
2. optionally `install.spark`
3. Launch the driver/SparkContext (in client mode) and connect to a local or 
remote cluster.

So the way we run spark-submit, RBackend, and the R process, and the way we 
create the SparkContext, doesn't really change. They all still run on the same 
machine (e.g., the user's laptop), so it is not necessary to run RBackend 
remotely for this scenario.

Running RBackend remotely is a new Spark deployment mode, and I think it 
requires more design and discussion.

> Configurable hostname for RBackend
> --
>
> Key: SPARK-16578
> URL: https://issues.apache.org/jira/browse/SPARK-16578
> Project: Spark
>  Issue Type: Sub-task
>  Components: SparkR
>Reporter: Shivaram Venkataraman
>Assignee: Junyang Qian
>
> One of the requirements that comes up with SparkR being a standalone package 
> is that users can now install just the R package on the client side and 
> connect to a remote machine that runs the RBackend class.
> We should check whether we can support this mode of execution and what its 
> pros / cons are.






[jira] [Comment Edited] (SPARK-16578) Configurable hostname for RBackend

2016-08-21 Thread Xiangrui Meng (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15429942#comment-15429942
 ] 

Xiangrui Meng edited comment on SPARK-16578 at 8/22/16 12:46 AM:
-

[~shivaram] I had an offline discussion with [~junyangq], and I feel that we 
might have a misunderstanding of the user scenarios.

The old workflow for SparkR is the following:

1. Users download and install the Spark distribution by themselves.
2. Users let R know where to find the SparkR package on the local machine.
3. `library(SparkR)`
4. Launch the driver/SparkContext (in client mode) and connect to a local or 
remote cluster.

And the ideal workflow is the following:

1. `install.packages("SparkR")` from CRAN
2. optionally `install.spark`
3. Launch the driver/SparkContext (in client mode) and connect to a local or 
remote cluster.

So the way we run spark-submit, RBackend, and the R process, and the way we 
create the SparkContext, doesn't really change. They all still run on the same 
machine (e.g., the user's laptop), so it is not necessary to run RBackend 
remotely for this scenario.

Running RBackend remotely is a new Spark deployment mode, and I think it 
requires more design and discussion.


was (Author: mengxr):
[~shivaram] I had an offline discussion with [~junyangq], and I feel that we 
might have a misunderstanding of the user scenarios.

The old workflow for SparkR is the following:

1. Users download and install the Spark distribution by themselves.
2. Users let R know where to find the SparkR package on the local machine.
3. `library(SparkR)`
4. Launch the driver/SparkContext (in client mode) and connect to a local or 
remote cluster.

And the ideal workflow is the following:

1. `install.packages("SparkR")`
2. optionally `install.spark`
3. Launch the driver/SparkContext (in client mode) and connect to a local or 
remote cluster.

So the way we run spark-submit, RBackend, and the R process, and the way we 
create the SparkContext, doesn't really change. They all still run on the same 
machine (e.g., the user's laptop), so it is not necessary to run RBackend 
remotely for this scenario.

Running RBackend remotely is a new Spark deployment mode, and I think it 
requires more design and discussion.

> Configurable hostname for RBackend
> --
>
> Key: SPARK-16578
> URL: https://issues.apache.org/jira/browse/SPARK-16578
> Project: Spark
>  Issue Type: Sub-task
>  Components: SparkR
>Reporter: Shivaram Venkataraman
>Assignee: Junyang Qian
>
> One of the requirements that comes up with SparkR being a standalone package 
> is that users can now install just the R package on the client side and 
> connect to a remote machine that runs the RBackend class.
> We should check whether we can support this mode of execution and what its 
> pros / cons are.


