[
https://issues.apache.org/jira/browse/SPARK-15781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15318221#comment-15318221
]
Jonathan Taws commented on SPARK-15781:
---------------------------------------
Before moving forward with the PR, I'd like to ask your opinion on the
changes to the documentation:
As we are removing SPARK_WORKER_INSTANCES from the settings table, one
replacement I've found would be to add SPARK_EXECUTOR_INSTANCES, along with
SPARK_EXECUTOR_CORES and SPARK_EXECUTOR_MEMORY, to the settings table so that
it is coherent with the deprecation message posted above.
However, the settings table as it stands is based on worker settings, and
introducing executor settings into it might be confusing.
A workaround would be to add an extra settings table at the end of the
*Cluster Launch Scripts* section to cover those particular settings. That way
it remains coherent with the other settings - all defined in spark-env.sh - and
it shouldn't be too confusing, as it won't sit in the middle of the worker
settings.
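As a rough illustration, the spark-env.sh entries such a table would document
might look like this (the values below are placeholders, not recommendations):
{code}
# spark-env.sh - executor-level settings, per the deprecation message
# (values are placeholders for illustration only)
SPARK_EXECUTOR_INSTANCES=4   # number of executors, replaces SPARK_WORKER_INSTANCES
SPARK_EXECUTOR_CORES=2       # cores allocated to each executor
SPARK_EXECUTOR_MEMORY=2g     # memory allocated to each executor
{code}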
> Misleading deprecated property in standalone cluster configuration
> documentation
> --------------------------------------------------------------------------------
>
> Key: SPARK-15781
> URL: https://issues.apache.org/jira/browse/SPARK-15781
> Project: Spark
> Issue Type: Documentation
> Components: Documentation
> Affects Versions: 1.6.1
> Reporter: Jonathan Taws
> Priority: Minor
>
> I am unsure whether this is regarded as an issue or not, but in the
> [latest|http://spark.apache.org/docs/latest/spark-standalone.html#cluster-launch-scripts]
> documentation for configuring Spark in standalone cluster mode, the following
> property is documented:
> |SPARK_WORKER_INSTANCES| Number of worker instances to run on each
> machine (default: 1). You can make this more than 1 if you have very
> large machines and would like multiple Spark worker processes. If you do set
> this, make sure to also set SPARK_WORKER_CORES explicitly to limit the cores
> per worker, or else each worker will try to use all the cores.|
> However, once I launch Spark with the spark-submit utility and the property
> {{SPARK_WORKER_INSTANCES}} set in my spark-env.sh file, I get the following
> deprecation warning:
> {code}
> 16/06/06 16:38:28 WARN SparkConf:
> SPARK_WORKER_INSTANCES was detected (set to '4').
> This is deprecated in Spark 1.0+.
> Please instead use:
> - ./spark-submit with --num-executors to specify the number of executors
> - Or set SPARK_EXECUTOR_INSTANCES
> - spark.executor.instances to configure the number of instances in the spark config.
> {code}
> Is it regarded as normal practice to have deprecated fields documented in
> the documentation?
> I would have preferred to learn about the --num-executors option directly from
> the documentation rather than having to submit my application and find a
> deprecation warning.
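For context, a rough sketch of the replacements the warning suggests when
submitting the application (the master URL, class name, jar path, and executor
count are placeholders, not values taken from this issue):
{code}
# Option 1: pass the executor count on the command line
./bin/spark-submit --num-executors 4 \
  --class com.example.MyApp \
  --master spark://master-host:7077 \
  my-app.jar

# Option 2: set it as a Spark property instead
./bin/spark-submit --conf spark.executor.instances=4 \
  --class com.example.MyApp \
  --master spark://master-host:7077 \
  my-app.jar
{code}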