[jira] [Commented] (SPARK-10971) sparkR: RRunner should allow setting path to Rscript
[ https://issues.apache.org/jira/browse/SPARK-10971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14981862#comment-14981862 ]

Apache Spark commented on SPARK-10971:
--------------------------------------

User 'sun-rui' has created a pull request for this issue:
https://github.com/apache/spark/pull/9368

> sparkR: RRunner should allow setting path to Rscript
> ----------------------------------------------------
>
>                 Key: SPARK-10971
>                 URL: https://issues.apache.org/jira/browse/SPARK-10971
>             Project: Spark
>          Issue Type: Bug
>          Components: SparkR
>    Affects Versions: 1.5.1
>            Reporter: Thomas Graves
>            Assignee: Sun Rui
>             Fix For: 1.5.3, 1.6.0
>
>
> I'm running Spark on YARN and trying to use R in cluster mode. RRunner seems
> to just call Rscript and assumes it's in the path. But on our YARN deployment
> R isn't installed on the nodes, so it needs to be distributed along with the
> job, and we need the ability to point to where it gets installed. sparkR in
> client mode has the config spark.sparkr.r.command to point to Rscript.
> RRunner should have something similar so that it works in cluster mode.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

[jira] [Commented] (SPARK-10971) sparkR: RRunner should allow setting path to Rscript
[ https://issues.apache.org/jira/browse/SPARK-10971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14973510#comment-14973510 ]

Patrick Wendell commented on SPARK-10971:
-----------------------------------------

Reynold has sent out the vote email based on the original fix. Since that
vote is likely to pass, this patch will probably be in 1.5.3.

[jira] [Commented] (SPARK-10971) sparkR: RRunner should allow setting path to Rscript
[ https://issues.apache.org/jira/browse/SPARK-10971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14972377#comment-14972377 ]

Shivaram Venkataraman commented on SPARK-10971:
-----------------------------------------------

[~pwendell] I see a commit for cutting 1.5.2 in branch-1.5, but I didn't see
an email about it. The fix version 1.5.2 here (Fix For: 1.5.2, 1.6.0) should
be updated if a release has been cut.

[jira] [Commented] (SPARK-10971) sparkR: RRunner should allow setting path to Rscript
[ https://issues.apache.org/jira/browse/SPARK-10971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14964983#comment-14964983 ]

Apache Spark commented on SPARK-10971:
--------------------------------------

User 'sun-rui' has created a pull request for this issue:
https://github.com/apache/spark/pull/9179

[jira] [Commented] (SPARK-10971) sparkR: RRunner should allow setting path to Rscript
[ https://issues.apache.org/jira/browse/SPARK-10971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14951696#comment-14951696 ]

Sun Rui commented on SPARK-10971:
---------------------------------

I agree that it is more flexible to allow the location of Rscript to be
configured in both client and cluster modes. But I am not sure it makes
sense to distribute R itself to the worker nodes with each job instead of
having it installed on them: the R binary is platform specific (and may
require platform-specific installation steps), and shipping R binaries has
a performance cost.

[jira] [Commented] (SPARK-10971) sparkR: RRunner should allow setting path to Rscript
[ https://issues.apache.org/jira/browse/SPARK-10971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14948307#comment-14948307 ]

Sun Rui commented on SPARK-10971:
---------------------------------

Just curious: how do you distribute Rscript to the YARN nodes? Why not
install R on all YARN nodes, so that it does not need to be distributed with
each job, to improve performance?

[jira] [Commented] (SPARK-10971) sparkR: RRunner should allow setting path to Rscript
[ https://issues.apache.org/jira/browse/SPARK-10971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14949167#comment-14949167 ]

Felix Cheung commented on SPARK-10971:
--------------------------------------

I think he is suggesting that the path to R/Rscript be configurable, not
that Rscript itself be distributed.

[jira] [Commented] (SPARK-10971) sparkR: RRunner should allow setting path to Rscript
[ https://issues.apache.org/jira/browse/SPARK-10971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14949199#comment-14949199 ]

Thomas Graves commented on SPARK-10971:
---------------------------------------

You shouldn't have to install everything a user needs on the YARN nodes.
Doing so can cause many different types of issues, the main ones being
version conflicts and a maintenance headache. The only downside of shipping
dependencies with the job is that, if you aren't using the distributed cache
properly, there is overhead in downloading them. Perhaps there are
distributions that don't recommend it, or cases where you want R installed
for performance reasons, but a general-use YARN cluster needs to allow users
to send their dependencies with their applications.

So yes, I am just suggesting that the path to Rscript be configurable. You
should be able to set a config like spark.sparkr.r.command to point to where
Rscript is located.

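
The behaviour requested in this thread can be sketched roughly as follows. This is only an illustration of the lookup logic, not Spark's RRunner code or the change made in the pull requests: the helper names `resolve_rscript` and `build_r_command`, the plain dict standing in for the Spark configuration, and the example path are all assumptions. Only the key `spark.sparkr.r.command` and the bare `Rscript` fallback come from the discussion.

```python
# Hypothetical sketch: choose the Rscript executable from a config value,
# falling back to a bare "Rscript" looked up on the PATH.

# Config key named in the thread (used by sparkR in client mode).
R_COMMAND_KEY = "spark.sparkr.r.command"
# Fallback matching the reported behaviour of just calling Rscript.
DEFAULT_R_COMMAND = "Rscript"


def resolve_rscript(conf):
    """Return the configured Rscript path, or the default if unset or blank."""
    value = (conf.get(R_COMMAND_KEY) or "").strip()
    return value or DEFAULT_R_COMMAND


def build_r_command(conf, script_path, script_args=()):
    """Build the argument list used to launch an R script."""
    return [resolve_rscript(conf), script_path, *script_args]


if __name__ == "__main__":
    # Illustrative path only: where a shipped R distribution might be unpacked.
    conf = {R_COMMAND_KEY: "./r_env/bin/Rscript"}
    print(" ".join(build_r_command(conf, "job.R", ["--input", "data.csv"])))
```

With an empty configuration the sketch falls back to `Rscript`, which is the behaviour the issue reports for cluster mode; with the key set, the job can point at an R installation distributed alongside the application.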