This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.1 by this push:
     new 6a04775  [SPARK-33891][DOCS][CORE] Update dynamic allocation related documents
6a04775 is described below

commit 6a0477589015a2534c0f9b764cd3e3b3b39e4118
Author: Dongjoon Hyun <dh...@apple.com>
AuthorDate: Wed Dec 23 23:43:21 2020 +0900

    [SPARK-33891][DOCS][CORE] Update dynamic allocation related documents
    
    ### What changes were proposed in this pull request?
    
    This PR aims to update the following:
    - Remove the outdated requirement for `spark.shuffle.service.enabled` in `configuration.md`
    - Update the dynamic allocation section in `job-scheduling.md`
    
    ### Why are the changes needed?
    
    To bring the documentation up to date.
    
    ### Does this PR introduce _any_ user-facing change?
    
    No, it's a documentation-only update.
    
    ### How was this patch tested?
    
    Manually.
    
    **BEFORE**
    ![Screen Shot 2020-12-23 at 2 22 04 AM](https://user-images.githubusercontent.com/9700541/102986441-ae647f80-44c5-11eb-97a3-87c2d368952a.png)
    ![Screen Shot 2020-12-23 at 2 22 34 AM](https://user-images.githubusercontent.com/9700541/102986473-bcb29b80-44c5-11eb-8eae-6802001c6dfa.png)
    
    **AFTER**
    ![Screen Shot 2020-12-23 at 2 25 36 AM](https://user-images.githubusercontent.com/9700541/102986767-2df24e80-44c6-11eb-8540-e74856a4c313.png)
    ![Screen Shot 2020-12-23 at 2 21 13 AM](https://user-images.githubusercontent.com/9700541/102986366-8e34c080-44c5-11eb-8054-1efd07c9458c.png)
    
    Closes #30906 from dongjoon-hyun/SPARK-33891.
    Authored-by: Dongjoon Hyun <dh...@apple.com>
    Signed-off-by: HyukjinKwon <gurwls...@apache.org>
    (cherry picked from commit 47d1aa4e93f668774fd0b16c780d3b1f6200bcd8)
    Signed-off-by: HyukjinKwon <gurwls...@apache.org>
---
 docs/configuration.md  |  3 +--
 docs/job-scheduling.md | 17 +++++++++--------
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/docs/configuration.md b/docs/configuration.md
index 21506e6..fe1fc3e 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -928,8 +928,7 @@ Apart from these, the following properties are also available, and may be useful
   <td>false</td>
   <td>
     Enables the external shuffle service. This service preserves the shuffle files written by
-    executors so the executors can be safely removed. This must be enabled if
-    <code>spark.dynamicAllocation.enabled</code> is "true". The external shuffle service
+    executors so the executors can be safely removed. The external shuffle service
     must be set up in order to enable it. See
     <a href="job-scheduling.html#configuration-and-setup">dynamic allocation
     configuration and setup documentation</a> for more information.
diff --git a/docs/job-scheduling.md b/docs/job-scheduling.md
index 7c7385b..f2b77cd 100644
--- a/docs/job-scheduling.md
+++ b/docs/job-scheduling.md
@@ -79,18 +79,19 @@ are no longer used and request them again later when there is demand. This featu
 useful if multiple applications share resources in your Spark cluster.
 
 This feature is disabled by default and available on all coarse-grained cluster managers, i.e.
-[standalone mode](spark-standalone.html), [YARN mode](running-on-yarn.html), and
-[Mesos coarse-grained mode](running-on-mesos.html#mesos-run-modes).
+[standalone mode](spark-standalone.html), [YARN mode](running-on-yarn.html),
+[Mesos coarse-grained mode](running-on-mesos.html#mesos-run-modes) and [K8s mode](running-on-kubernetes.html).
+
 
 ### Configuration and Setup
 
-There are two requirements for using this feature. First, your application must set
-`spark.dynamicAllocation.enabled` to `true`. Second, you must set up an *external shuffle service*
-on each worker node in the same cluster and set `spark.shuffle.service.enabled` to true in your
-application. The purpose of the external shuffle service is to allow executors to be removed
+There are two ways for using this feature.
+First, your application must set both `spark.dynamicAllocation.enabled` and `spark.dynamicAllocation.shuffleTracking.enabled` to `true`.
+Second, your application must set both `spark.dynamicAllocation.enabled` and `spark.shuffle.service.enabled` to `true`
+after you set up an *external shuffle service* on each worker node in the same cluster.
+The purpose of the shuffle tracking or the external shuffle service is to allow executors to be removed
 without deleting shuffle files written by them (more detail described
-[below](job-scheduling.html#graceful-decommission-of-executors)). The way to set up this service
-varies across cluster managers:
+[below](job-scheduling.html#graceful-decommission-of-executors)). While it is simple to enable shuffle tracking, the way to set up the external shuffle service varies across cluster managers:
 
 In standalone mode, simply start your workers with `spark.shuffle.service.enabled` set to `true`.

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
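For reference, the two ways of enabling dynamic allocation that this patch documents could be sketched as `spark-defaults.conf` fragments (an illustrative sketch, not part of the patch; pick one option, and note that shuffle tracking assumes a Spark version that supports `spark.dynamicAllocation.shuffleTracking.enabled`):

```properties
# Option 1: dynamic allocation with shuffle tracking
# (no external shuffle service required)
spark.dynamicAllocation.enabled                    true
spark.dynamicAllocation.shuffleTracking.enabled    true

# Option 2: dynamic allocation backed by an external shuffle service
# (the service must already be set up on each worker node in the cluster)
spark.dynamicAllocation.enabled    true
spark.shuffle.service.enabled      true
```

Either option lets executors be removed without losing the shuffle files they wrote, which is the prerequisite the patch describes.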