Jonathan Taws created SPARK-15781:
-------------------------------------
Summary: Misleading deprecated property in standalone cluster configuration documentation
Key: SPARK-15781
URL: https://issues.apache.org/jira/browse/SPARK-15781
Project: Spark
Issue Type: Documentation
Components: Documentation
Affects Versions: 1.6.1
Reporter: Jonathan Taws
Priority: Minor
I am unsure whether this is regarded as an issue or not, but the
[latest|http://spark.apache.org/docs/latest/spark-standalone.html#cluster-launch-scripts]
documentation for launching Spark in standalone cluster mode documents the
following property:
|SPARK_WORKER_INSTANCES|Number of worker instances to run on each machine (default: 1). You can make this more than 1 if you have very large machines and would like multiple Spark worker processes. If you do set this, make sure to also set SPARK_WORKER_CORES explicitly to limit the cores per worker, or else each worker will try to use all the cores.|
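For context, a spark-env.sh that follows this documented advice would look roughly like the sketch below (the core count is only illustrative; the instance count of 4 is the value that shows up in the warning further down):
{code}
# spark-env.sh -- following the documented standalone advice
SPARK_WORKER_INSTANCES=4   # run 4 worker processes on this machine
SPARK_WORKER_CORES=2       # limit cores per worker, as the docs recommend
{code}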
However, once I launch Spark with the spark-submit utility and the property
{{SPARK_WORKER_INSTANCES}} set in my spark-env.sh file, I get the following
deprecation warning:
{code}
16/06/06 16:38:28 WARN SparkConf:
SPARK_WORKER_INSTANCES was detected (set to '4').
This is deprecated in Spark 1.0+.
Please instead use:
- ./spark-submit with --num-executors to specify the number of executors
- Or set SPARK_EXECUTOR_INSTANCES
- spark.executor.instances to configure the number of instances in the spark config.
{code}
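Going by that warning, the recommended replacement would presumably be something along these lines (a sketch only; the master URL and jar name are placeholders, not from the docs):
{code}
# Pass the executor count at submit time instead of via spark-env.sh
./bin/spark-submit \
  --master spark://master:7077 \
  --num-executors 4 \
  myapp.jar

# or, equivalently, set it in conf/spark-defaults.conf:
# spark.executor.instances   4
{code}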
Is it normal practice to keep deprecated properties documented like this?
I would have preferred to learn about the --num-executors option directly from
the documentation rather than having to submit my application and discover a
deprecation warning.