Github user aarondav commented on a diff in the pull request:

    https://github.com/apache/spark/pull/116#discussion_r10675685
  
    --- Diff: bin/spark-shell ---
    @@ -30,69 +30,378 @@ esac
     # Enter posix mode for bash
     set -o posix
     
    -CORE_PATTERN="^[0-9]+$"
    -MEM_PATTERN="^[0-9]+[m|g|M|G]$"
    -
    +## Global script variables
     FWDIR="$(cd `dirname $0`/..; pwd)"
     
    -if [ "$1" = "--help" ] || [ "$1" = "-h" ]; then
    -   echo "Usage: spark-shell [OPTIONS]"
    -   echo "OPTIONS:"
    -   echo "-c --cores num, the maximum number of cores to be used by the spark shell"
    -   echo "-em --execmem num[m|g], the memory used by each executor of spark shell"
    -   echo "-dm --drivermem num[m|g], the memory used by the spark shell and driver"
    -   echo "-h --help, print this help information"
    -   exit
    -fi
    +VERBOSE=0
    +DRY_RUN=0
    +SPARK_REPL_OPTS="${SPARK_REPL_OPTS:-""}"
    +MASTER=""
    +
    +#CLI Color Templates
    +txtund=$(tput sgr 0 1)          # Underline
    +txtbld=$(tput bold)             # Bold
    +bldred=${txtbld}$(tput setaf 1) # red
    +bldyel=${txtbld}$(tput setaf 3) # yellow
    +bldblu=${txtbld}$(tput setaf 4) # blue
    +bldwht=${txtbld}$(tput setaf 7) # white
    +txtrst=$(tput sgr0)             # Reset
    +info=${bldwht}*${txtrst}        # Feedback
    +pass=${bldblu}*${txtrst}
    +warn=${bldred}*${txtrst}
    +ques=${bldblu}?${txtrst}
    +
    +# Helper function to describe the script usage
    +function usage() {
    +    cat << EOF
    +
    +${txtbld}Usage${txtrst}: spark-shell [OPTIONS]
    +
    +${txtbld}OPTIONS${txtrst}:
    +
    +${txtund}Basic${txtrst}:
    +
    +    -h  --help              : Print this help information.
    +    -c  --executor-cores    : The maximum number of cores to be used by the Spark Shell.
    +    -em --executor-memory   : The memory used by each executor of the Spark Shell, the number
    +                              is followed by m for megabytes or g for gigabytes, e.g. "1g".
    +    -dm --driver-memory     : The memory used by the Spark Shell, the number is followed
    +                              by m for megabytes or g for gigabytes, e.g. "1g".
    +
    +${txtund}Soon to be deprecated${txtrst}:
    +
    +    --cores : please use -c/--executor-cores
    +
    +${txtund}Other options${txtrst}:
    +
    +    -mip --master-ip     : The Spark Master IP/hostname.
    +    -mp  --master-port   : The Spark Master port.
    +    -m   --master        : A full string that describes the Spark Master, e.g. "local" or "spark://localhost:7077".
    +    -ld  --local-dir     : The absolute path to a local directory that will be used for "scratch" space in Spark.
    +    -dh  --driver-host   : Hostname or IP address for the driver to listen on.
    +    -dp  --driver-port   : The port for the driver to listen on.
    +    -uip --ui-port       : The port for your application's dashboard, which shows memory and workload data.
    +    --parallelism        : The default number of tasks to use across the cluster for distributed shuffle operations.
    --- End diff ---
    
    I'm not sure this should be configurable here, since we try to choose a smart default based on the number of cores. If it is kept, though, it should probably be called default-parallelism, since every shuffle operation has a parameter that overrides this setting. We don't want to confuse users into thinking that this is some general setting regarding how parallel Spark is willing to be.
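For readers following along, the flag-to-property wiring under discussion can be sketched roughly as below. This is a simplified sketch, not the actual patch: the pairing of each flag with a `-D` Java system property is an assumption, though `spark.cores.max`, `spark.executor.memory`, and `spark.default.parallelism` are real Spark property names of this era. Note that, as the comment says, individual shuffle operations (e.g. `reduceByKey(func, numPartitions)`) accept a partition count that overrides the default parallelism per call.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the flag-parsing pattern such a launcher script
# might use. The real script defaults SPARK_REPL_OPTS from the environment;
# we start from empty here for a deterministic demonstration.
SPARK_REPL_OPTS=""

# Simulate an invocation like: spark-shell -c 4 -em 2g --parallelism 16
set -- -c 4 -em 2g --parallelism 16

while [ $# -gt 0 ]; do
  case "$1" in
    -c|--executor-cores)
      shift
      SPARK_REPL_OPTS="$SPARK_REPL_OPTS -Dspark.cores.max=$1"
      ;;
    -em|--executor-memory)
      shift
      SPARK_REPL_OPTS="$SPARK_REPL_OPTS -Dspark.executor.memory=$1"
      ;;
    --parallelism)
      shift
      SPARK_REPL_OPTS="$SPARK_REPL_OPTS -Dspark.default.parallelism=$1"
      ;;
    *)
      echo "Unknown option: $1" >&2
      exit 1
      ;;
  esac
  shift
done

echo "$SPARK_REPL_OPTS"
```

Under the renaming suggested above, the last branch would match `--default-parallelism` instead, making it clear the value is only a fallback for shuffles that do not pass an explicit partition count.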

