This is an automated email from the ASF dual-hosted git repository.
chengpan pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/incubator-celeborn-website.git
The following commit(s) were added to refs/heads/main by this push:
new 28d97de [CELEBORN-77] Doc for quick rolling upgrade (#1)
28d97de is described below
commit 28d97de1e260afe0d09b8f02cab4a23fff05cf9e
Author: Angerszhuuuu <[email protected]>
AuthorDate: Tue Dec 6 18:40:25 2022 +0800
[CELEBORN-77] Doc for quick rolling upgrade (#1)
* [CELEBORN-77] Doc for quick rolling upgrade
* update
* Update upgrade.md
* Update upgrade.md
---
docs/developer_guide/docs_and_website.md | 2 +-
docs/user_guide/upgrade.md | 90 ++++++++++++++++++++++++++++++++
mkdocs.yml | 2 +
3 files changed, 93 insertions(+), 1 deletion(-)
diff --git a/docs/developer_guide/docs_and_website.md b/docs/developer_guide/docs_and_website.md
index 44a94bf..c5cdd11 100644
--- a/docs/developer_guide/docs_and_website.md
+++ b/docs/developer_guide/docs_and_website.md
@@ -14,7 +14,7 @@ license: |
limitations under the License.
---
-Buiding Docs and Website
+Building Docs and Website
===
## Setup Python
diff --git a/docs/user_guide/upgrade.md b/docs/user_guide/upgrade.md
new file mode 100644
index 0000000..0ba6003
--- /dev/null
+++ b/docs/user_guide/upgrade.md
@@ -0,0 +1,90 @@
+---
+license: |
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+ http://www.apache.org/licenses/LICENSE-2.0
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+---
+
+
+Upgrade
+===
+
+# Rolling upgrade
+
+Celeborn clusters need a fast rolling upgrade process.
+For a rolling upgrade to be fast and not disrupt running jobs,
+files already written to a worker must be committed
+and must remain readable after the worker restarts. Celeborn provides the
+following mechanisms to support rolling upgrades.
+
+## Fixed fetch port and client retry
+
+On the shuffle reduce side, the read client obtains the worker's host and port
+along with information about the file to be read. To ensure that the data can still be read
+after a worker's rolling restart completes,
+the worker needs to use a fixed fetch service port.
+The configuration is `celeborn.worker.fetch.port`, and the default value is `0`, which means
+the worker automatically selects a free port at startup. Users need to set a fixed value, such as `9092`.
+
+At the same time, users need to adjust the client's retry count and retry wait time
+according to how long a rolling restart takes in their cluster,
+so that the shuffle client can read data through retries after a worker restarts.
+The number of fetch retries is configured by `celeborn.fetch.maxRetries`; the default value is `3`.
+The wait time between fetch retries is configured by `celeborn.data.io.retryWait`; the default value is `5s`.
+Users can increase these values as appropriate for their situation.
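
To make the retry tuning above concrete, here is a minimal Python sketch that estimates how long a client keeps retrying. It assumes the retry window is roughly `maxRetries` times `retryWait` and ignores per-attempt connection time, which may not match Celeborn's actual client behavior. The helper names and the 40-second restart time are hypothetical and only serve the illustration.

```python
import re


def parse_seconds(value):
    """Parse a duration such as '5s' into whole seconds (only the 's' suffix is handled)."""
    match = re.fullmatch(r"(\d+)s", value.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {value!r}")
    return int(match.group(1))


def retry_window_seconds(max_retries, retry_wait):
    """Rough estimate (an assumption): the client gives up after retries x wait."""
    return max_retries * parse_seconds(retry_wait)


def covers_restart(max_retries, retry_wait, restart_seconds):
    """True if the estimated retry window is at least as long as a worker restart."""
    return retry_window_seconds(max_retries, retry_wait) >= restart_seconds


# Hypothetical restart time for one worker; measure this in your own cluster.
restart_seconds = 40

print(covers_restart(3, "5s", restart_seconds))   # default values
print(covers_restart(5, "10s", restart_seconds))  # values from the example table
```

Under this estimate the defaults give a 15-second window and the example values give a 50-second window, so only the latter would outlast the hypothetical 40-second restart.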
+
+## Worker stores file meta information
+
+The shuffle client records each shuffle partition location's host, service port, and filename.
+To let workers resume serving existing shuffle data after a restart,
+workers store the metadata needed to read shuffle partition files
+in LevelDB during shutdown, and restore it after restarting.
+Users should set `celeborn.worker.graceful.shutdown.enabled` to `true` to enable graceful shutdown.
+During graceful shutdown, the worker waits for all partitions allocated on it to be committed
+within a timeout of `celeborn.worker.graceful.shutdown.checkSlotsFinished.timeout`, whose default value is `480s`.
+Then the worker waits for the partition sorter to finish all sort tasks within a timeout of
+`celeborn.worker.graceful.shutdown.partitionSorter.shutdownTimeout`, whose default value is `120s`.
+The whole graceful shutdown process should finish within a timeout of
+`celeborn.worker.graceful.shutdown.timeout`, whose default value is `600s`.
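
The three timeouts above can be sanity-checked with a small Python sketch. It rests on an assumption that the text does not state outright: the two stage timeouts run one after the other, so their sum should fit inside the overall graceful shutdown timeout (the defaults, 480s + 120s = 600s, are consistent with that reading). The function names are hypothetical.

```python
import re

PREFIX = "celeborn.worker.graceful.shutdown."


def parse_seconds(value):
    """Parse a duration such as '480s' into whole seconds (only the 's' suffix is handled)."""
    match = re.fullmatch(r"(\d+)s", value.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {value!r}")
    return int(match.group(1))


def shutdown_budget(conf):
    """Return (stage_total, overall) in seconds for the graceful shutdown timeouts."""
    stage_total = (
        parse_seconds(conf[PREFIX + "checkSlotsFinished.timeout"])
        + parse_seconds(conf[PREFIX + "partitionSorter.shutdownTimeout"])
    )
    overall = parse_seconds(conf[PREFIX + "timeout"])
    return stage_total, overall


def stages_fit(conf):
    """Assumed rule of thumb: the stage timeouts should not exceed the overall timeout."""
    stage_total, overall = shutdown_budget(conf)
    return stage_total <= overall


# Default values quoted in the text above.
defaults = {
    PREFIX + "checkSlotsFinished.timeout": "480s",
    PREFIX + "partitionSorter.shutdownTimeout": "120s",
    PREFIX + "timeout": "600s",
}

print(shutdown_budget(defaults))
print(stages_fit(defaults))
```

If you raise one of the stage timeouts, a check like this is a quick reminder to raise the overall timeout as well.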
+
+## Hard split of allocated partitions and pre-commit of hard-split partitions
+
+As mentioned in the previous section, the worker needs to wait for all allocated partition files
+to be committed during the restart process. This means that the worker needs to wait for all shuffles
+running on it to finish before it can restart;
+otherwise some information is lost, abnormal partition files are left behind, and reading cannot be resumed.
+
+To speed up the restart process, the worker makes all push data requests return the HARD_SPLIT flag
+during shutdown, and the shuffle client then requests a new partition location for each affected partition.
+The client side can then record all HARD_SPLIT partition information and pre-commit these partitions,
+so that the partitions allocated on the worker can be committed in a very short time. Users should enable
+`celeborn.shuffle.batchHandleCommitPartition.enabled`; the default value is `false`.
+
+## Example setting:
+
+### Worker
+
+| Key | Value |
+|-------------------------------------------------------------------|-------|
+| celeborn.worker.graceful.shutdown.enabled | true |
+| celeborn.worker.graceful.shutdown.checkSlotsFinished.timeout | 480s |
+| celeborn.worker.graceful.shutdown.partitionSorter.shutdownTimeout | 120s |
+| celeborn.worker.graceful.shutdown.timeout | 600s |
+| celeborn.worker.fetch.port | 9092 |
+
+### Client
+
+| Key | Value |
+|-----------------------------------------------------------|-------|
+| spark.celeborn.shuffle.batchHandleCommitPartition.enabled | true |
+| spark.celeborn.fetch.maxRetries | 5 |
+| spark.celeborn.data.io.retryWait | 10s |
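
The client table uses the same keys as the prose, with a `spark.` prefix added. As an illustration only, the Python sketch below turns the unprefixed client keys into `--conf` arguments for `spark-submit`. The helper name `to_spark_submit_args` is hypothetical, and the sketch assumes every client key takes the `spark.` prefix exactly as in the table.

```python
def to_spark_submit_args(client_conf):
    """Build spark-submit '--conf' arguments, adding the 'spark.' prefix to each key."""
    args = []
    for key, value in sorted(client_conf.items()):
        args += ["--conf", f"spark.{key}={value}"]
    return args


# Client-side example values from the table above, without the 'spark.' prefix.
client_conf = {
    "celeborn.shuffle.batchHandleCommitPartition.enabled": "true",
    "celeborn.fetch.maxRetries": "5",
    "celeborn.data.io.retryWait": "10s",
}

print(" ".join(to_spark_submit_args(client_conf)))
```

The resulting arguments can be appended to an existing `spark-submit` command line.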
diff --git a/mkdocs.yml b/mkdocs.yml
index 271f151..b12c963 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -79,6 +79,8 @@ nav:
- Developer Guide:
- Build and Test: developer_guide/build_and_test.md
- Docs and Website: developer_guide/docs_and_website.md
+ - User Guide:
+ - Upgrade: user_guide/upgrade.md
- Apache Software Foundation:
- Foundation: asf/asf.md
- Disclaimer: asf/disclaimer.md