This is an automated email from the ASF dual-hosted git repository.
jiangxb1987 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 0d4e4df [SPARK-31018][CORE][DOCS] Deprecate support of multiple
workers on the same host in Standalone
0d4e4df is described below
commit 0d4e4df06105cf2985dde17c1af76093b3ae8c13
Author: yi.wu <[email protected]>
AuthorDate: Wed Apr 15 11:29:55 2020 -0700
[SPARK-31018][CORE][DOCS] Deprecate support of multiple workers on the same
host in Standalone
### What changes were proposed in this pull request?
Update the documentation and the shell script to warn users that support for
multiple workers on the same host is deprecated.
### Why are the changes needed?
This is a sub-task of
[SPARK-30978](https://issues.apache.org/jira/browse/SPARK-30978), which plans
to remove support for multiple workers on the same host entirely in Spark 3.1.
This PR takes the first step by deprecating it in Spark 3.0.
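For anyone migrating off `SPARK_WORKER_INSTANCES`, the recommended layout (one worker per node, several executors per worker) can be sketched roughly as below. This is an illustration and not part of the patch: the sizes are made-up values and `executors_per_worker` is a hypothetical helper. It only estimates how many executors of a given size fit in one worker, and ignores other limits such as `spark.cores.max` or other applications sharing the worker.

```shell
# conf/spark-env.sh on each node: a single worker that owns the whole machine.
# The values are illustrative assumptions, not recommendations.
SPARK_WORKER_CORES=32
SPARK_WORKER_MEMORY=256g

# Hypothetical helper (not provided by Spark): rough upper bound on the number
# of executors one worker can host, limited by whichever resource runs out first.
# Arguments: worker cores, worker memory (GiB), executor cores, executor memory (GiB).
executors_per_worker() {
  by_cores=$(( $1 / $3 ))
  by_mem=$(( $2 / $4 ))
  if [ "$by_cores" -lt "$by_mem" ]; then
    echo "$by_cores"
  else
    echo "$by_mem"
  fi
}

# Example: 8-core, 64 GiB executors on the worker configured above.
executors_per_worker 32 256 8 64
```

An application would then request executors of that size with the existing `--executor-cores` and `--executor-memory` options of `spark-submit`, instead of relying on several workers per host.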
### Does this PR introduce any user-facing change?
Yes, users see a warning when they run the start worker script.
### How was this patch tested?
Tested manually.
Closes #27768 from Ngone51/deprecate_spark_worker_instances.
Authored-by: yi.wu <[email protected]>
Signed-off-by: Xingbo Jiang <[email protected]>
---
docs/core-migration-guide.md | 2 ++
docs/hardware-provisioning.md | 8 ++++----
sbin/start-slave.sh | 2 +-
3 files changed, 7 insertions(+), 5 deletions(-)
diff --git a/docs/core-migration-guide.md b/docs/core-migration-guide.md
index 66a489b..cde6e07 100644
--- a/docs/core-migration-guide.md
+++ b/docs/core-migration-guide.md
@@ -38,3 +38,5 @@ license: |
- Event log file will be written as UTF-8 encoding, and Spark History Server will replay event log files as UTF-8 encoding. Previously Spark wrote the event log file as default charset of driver JVM process, so Spark History Server of Spark 2.x is needed to read the old event log files in case of incompatible encoding.
- A new protocol for fetching shuffle blocks is used. It's recommended that external shuffle services be upgraded when running Spark 3.0 apps. You can still use old external shuffle services by setting the configuration `spark.shuffle.useOldFetchProtocol` to `true`. Otherwise, Spark may run into errors with messages like `IllegalArgumentException: Unexpected message type: <number>`.
+
+- `SPARK_WORKER_INSTANCES` is deprecated in Standalone mode. It's recommended to launch multiple executors in one worker and launch one worker per node instead of launching multiple workers per node and launching one executor per worker.
diff --git a/docs/hardware-provisioning.md b/docs/hardware-provisioning.md
index 4e5d681..fc87995f 100644
--- a/docs/hardware-provisioning.md
+++ b/docs/hardware-provisioning.md
@@ -63,10 +63,10 @@ Note that memory usage is greatly affected by storage level and serialization fo
the [tuning guide](tuning.html) for tips on how to reduce it.
Finally, note that the Java VM does not always behave well with more than 200 GiB of RAM. If you
-purchase machines with more RAM than this, you can run _multiple worker JVMs per node_. In
-Spark's [standalone mode](spark-standalone.html), you can set the number of workers per node
-with the `SPARK_WORKER_INSTANCES` variable in `conf/spark-env.sh`, and the number of cores
-per worker with `SPARK_WORKER_CORES`.
+purchase machines with more RAM than this, you can launch multiple executors in a single node. In
+Spark's [standalone mode](spark-standalone.html), a worker is responsible for launching multiple
+executors according to its available memory and cores, and each executor will be launched in a
+separate Java VM.
# Network
diff --git a/sbin/start-slave.sh b/sbin/start-slave.sh
index 2cb17a0..9b3b26b 100755
--- a/sbin/start-slave.sh
+++ b/sbin/start-slave.sh
@@ -22,7 +22,7 @@
# Environment Variables
#
# SPARK_WORKER_INSTANCES The number of worker instances to run on this
-# slave. Default is 1.
+# slave. Default is 1. Note it has been deprecate since Spark 3.0.
# SPARK_WORKER_PORT The base port number for the first worker. If set,
# subsequent workers will increment this number. If
# unset, Spark will find a valid port number, but