Github user aarondav commented on a diff in the pull request:

    https://github.com/apache/spark/pull/116#discussion_r10675622
  
    --- Diff: bin/spark-shell ---
    @@ -30,69 +30,378 @@ esac
     # Enter posix mode for bash
     set -o posix
     
    -CORE_PATTERN="^[0-9]+$"
    -MEM_PATTERN="^[0-9]+[m|g|M|G]$"
    -
    +## Global script variables
     FWDIR="$(cd `dirname $0`/..; pwd)"
     
    -if [ "$1" = "--help" ] || [ "$1" = "-h" ]; then
    -   echo "Usage: spark-shell [OPTIONS]"
    -   echo "OPTIONS:"
    -   echo "-c --cores num, the maximum number of cores to be used by the spark shell"
    -   echo "-em --execmem num[m|g], the memory used by each executor of spark shell"
    -   echo "-dm --drivermem num[m|g], the memory used by the spark shell and driver"
    -   echo "-h --help, print this help information" 
    -   exit
    -fi
    +VERBOSE=0
    +DRY_RUN=0
    +SPARK_REPL_OPTS="${SPARK_REPL_OPTS:-""}"
    +MASTER=""
    +
    +#CLI Color Templates
    +txtund=$(tput sgr 0 1)          # Underline
    +txtbld=$(tput bold)             # Bold
    +bldred=${txtbld}$(tput setaf 1) # red
    +bldyel=${txtbld}$(tput setaf 3) # yellow
    +bldblu=${txtbld}$(tput setaf 4) # blue
    +bldwht=${txtbld}$(tput setaf 7) # white
    +txtrst=$(tput sgr0)             # Reset
    +info=${bldwht}*${txtrst}        # Feedback
    +pass=${bldblu}*${txtrst}
    +warn=${bldred}*${txtrst}
    +ques=${bldblu}?${txtrst}
    +
    +# Helper function to describe the script usage
    +function usage() {
    +    cat << EOF
    +
    +${txtbld}Usage${txtrst}: spark-shell [OPTIONS]
    +
    +${txtbld}OPTIONS${txtrst}:
    +
    +${txtund}Basic${txtrst}:
    +
    +    -h  --help              : Print this help information.
    +    -c  --executor-cores    : The maximum number of cores to be used by the Spark Shell.
    +    -em --executor-memory   : The memory used by each executor of the Spark Shell, the number
    +                              is followed by m for megabytes or g for gigabytes, e.g. "1g".
    +    -dm --driver-memory     : The memory used by the Spark Shell, the number is followed
    +                              by m for megabytes or g for gigabytes, e.g. "1g".
    +
    +${txtund}Soon to be deprecated${txtrst}:
    +
    +    --cores : please use -c/--executor-cores
    +
    +${txtund}Other options${txtrst}:
    +
    +    -mip --master-ip     : The Spark Master ip/hostname.
    +    -mp  --master-port   : The Spark Master port.
    +    -m   --master        : A full string that describes the Spark Master, e.g. "local" or "spark://localhost:7077".
    +    -ld  --local-dir     : The absolute path to a local directory that will be used for "scratch" space in Spark.
    --- End diff --
    
    I think we should avoid replicating too many configuration options, especially those that can be set via Spark properties, since those should probably be configured in spark-env.sh for use across your entire cluster. It is definitely worthwhile to be able to configure environment variables, particularly those that are used only by the shell and nothing else, but I think there are a few properties which we don't need to have command-line options for:
    
    - local-dir
    - locality-wait
    - schedule-fair
    - max-failures
    - mesos-coarse (this is only useful for the Mesos scheduler and is probably a pretty serious decision one should make, rather than just being a transient option per shell)
    
    Please feel free to fight back if you disagree on some of these options -- I just feel they aren't configured often enough to warrant a command-line option, or that they have implications beyond a single shell session.
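
    For illustration, the options above could instead be set cluster-wide in conf/spark-env.sh via SPARK_JAVA_OPTS. This is only a sketch: the paths and values below are placeholders, and the property names are the ones these flags appear to map to in this era of Spark -- check the configuration docs for your version.

    ```shell
    # Sketch of conf/spark-env.sh: set these once for the whole cluster
    # instead of passing them per shell session. Values are placeholders.
    SPARK_JAVA_OPTS+=" -Dspark.local.dir=/mnt/spark-scratch"  # scratch space (--local-dir)
    SPARK_JAVA_OPTS+=" -Dspark.locality.wait=3000"            # locality wait in ms (--locality-wait)
    SPARK_JAVA_OPTS+=" -Dspark.scheduler.mode=FAIR"           # fair scheduling (--schedule-fair)
    SPARK_JAVA_OPTS+=" -Dspark.task.maxFailures=4"            # task retry limit (--max-failures)
    SPARK_JAVA_OPTS+=" -Dspark.mesos.coarse=true"             # coarse-grained Mesos mode (--mesos-coarse)
    export SPARK_JAVA_OPTS
    ```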

