Github user yanboliang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/15888#discussion_r88214305
  
    --- Diff: R/pkg/R/sparkR.R ---
    @@ -550,24 +566,28 @@ processSparkPackages <- function(packages) {
     #
     # @param sparkHome directory to find Spark package.
     # @param master the Spark master URL, used to check local or remote mode.
    +# @param deployMode whether to deploy your driver on the worker nodes (cluster) or
    +#        locally as an external client (client).
     # @return NULL if no need to update sparkHome, and new sparkHome otherwise.
    -sparkCheckInstall <- function(sparkHome, master) {
    +sparkCheckInstall <- function(sparkHome, master, deployMode) {
       if (!isSparkRShell()) {
         if (!is.na(file.info(sparkHome)$isdir)) {
           msg <- paste0("Spark package found in SPARK_HOME: ", sparkHome)
           message(msg)
           NULL
         } else {
    -      if (!nzchar(master) || isMasterLocal(master)) {
    +      if (isMasterLocal(master)) {
             msg <- paste0("Spark not found in SPARK_HOME: ",
                           sparkHome)
             message(msg)
             packageLocalDir <- install.spark()
             packageLocalDir
    -      } else {
    +      } else if (nzchar(deployMode) && deployMode == "client") {
             msg <- paste0("Spark not found in SPARK_HOME: ",
                           sparkHome, "\n", installInstruction("remote"))
             stop(msg)
    +      } else {
    +        NULL
    --- End diff ---
    
    @felixcheung
    If we submit SparkR jobs via ```spark-submit```, then ```master``` and ```deployMode``` should be empty. ```sparkHome``` should be set correctly in ```client``` and ```local``` mode, but empty in ```cluster``` mode. ```sparkCheckInstall``` returns NULL in all of these scenarios.
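    For concreteness, here is a minimal sketch of the ```spark-submit``` path (the script name and submit flags are hypothetical):
    ```R
    # Submitted via something like:
    #   bin/spark-submit --master yarn --deploy-mode cluster example.R
    # spark-submit supplies the master and deploy mode itself, so the script passes
    # neither, and sparkCheckInstall sees master == "" and deployMode == "".
    library(SparkR)
    sparkR.session()
    ```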
    If we start SparkR via ```sparkR.session``` and pass in the corresponding arguments, we should download the Spark package when ```master = local[*]```. If ```deployMode``` is set to ```client``` but ```SPARK_HOME``` is not set correctly, we should stop due to the misconfiguration. Otherwise (deploying in ```yarn``` or ```mesos``` cluster mode), we return NULL here.
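    To make these cases concrete, a rough sketch (assuming ```deployMode``` reaches ```sparkCheckInstall``` via the ```spark.submit.deployMode``` entry of ```sparkConfig```; each call illustrates one branch):
    ```R
    library(SparkR)

    # Local master: Spark is downloaded via install.spark() if SPARK_HOME is unset.
    sparkR.session(master = "local[*]")

    # Remote master in client deploy mode without a valid SPARK_HOME:
    # sparkCheckInstall stops with the "remote" installation instructions.
    sparkR.session(master = "yarn",
                   sparkConfig = list(spark.submit.deployMode = "client"))

    # Cluster deploy mode: sparkCheckInstall returns NULL and the session proceeds,
    # since the driver runs on the cluster and needs no local Spark package.
    sparkR.session(master = "yarn",
                   sparkConfig = list(spark.submit.deployMode = "cluster"))
    ```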
    Since Spark 2.0, ```master = yarn-cluster``` has been deprecated in favor of ```master = yarn, deployMode = cluster```. Since ```sparkR.session``` was introduced in 2.0, I don't think we should support the old convention here.


