This is an automated email from the ASF dual-hosted git repository.
angerszhuuuu pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/incubator-celeborn.git
The following commit(s) were added to refs/heads/main by this push:
new f15c2a7a6 [CELEBORN-814] Merge upgrade doc to Deployment tab and add TOC
f15c2a7a6 is described below
commit f15c2a7a683d5ed26372cb4616a7e94bca0bc8ee
Author: Angerszhuuuu <[email protected]>
AuthorDate: Thu Jul 20 14:06:12 2023 +0800
[CELEBORN-814] Merge upgrade doc to Deployment tab and add TOC
### What changes were proposed in this pull request?
As the title says: merge the upgrade doc into the Deployment tab and add a TOC.
<img width="1643" alt="Screenshot 2023-07-20 12 01 06 PM"
src="https://github.com/apache/incubator-celeborn/assets/46485123/d8822003-602f-4fe8-9634-ff25c0367cb1">
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
Closes #1738 from AngersZhuuuu/CELEBORN-814.
Authored-by: Angerszhuuuu <[email protected]>
Signed-off-by: Angerszhuuuu <[email protected]>
---
docs/upgrade.md | 19 +++++++++----------
mkdocs.yml | 3 ++-
2 files changed, 11 insertions(+), 11 deletions(-)
diff --git a/docs/upgrade.md b/docs/upgrade.md
index 86365b599..edec9af08 100644
--- a/docs/upgrade.md
+++ b/docs/upgrade.md
@@ -1,7 +1,4 @@
---
-hide:
- - navigation
-
license: |
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
@@ -21,7 +18,7 @@ license: |
Upgrade
===
-# Rolling upgrade
+## Rolling upgrade
It is necessary to support a fast rolling upgrade process for the Celeborn cluster.
In order to achieve a fast and unaffected rolling upgrade process,
@@ -29,7 +26,9 @@ Celeborn should support that the written file in the worker should be committed
and support reading after the worker restarted. Celeborn have done the following mechanism to support rolling upgrade.
-## Fixed fetch port and client retry
+### Background
+
+**Fixed fetch port and client retry**
In the shuffle reduce side, the read client will obtain the worker's host/port and
information of the file to be read. In order to ensure that the data can be read
@@ -45,7 +44,7 @@ The shuffle client fetch data retry times configuration is `celeborn.client.fetc
The shuffle client fetch data retry wait time configuration is `celeborn.data.io.retryWait`, default value is `5s`.
Users can increase the configuration value appropriately according to the situation.
-## Worker store file meta information
+**Worker store file meta information**
Shuffle client records the shuffle partition location's host, service port, and filename,
to support workers recovering reading existing shuffle data after worker restart,
@@ -59,7 +58,7 @@ Then worker will wait for partition sorter finish all sort task within a timeout
The whole graceful shutdown process should be finished within a timeout of `celeborn.worker.graceful.shutdown.timeout`, which default value is `600s`.
-## Allocated partition do hard split and Pre-commit hard split partition
+**Allocated partition do hard split and Pre-commit hard split partition**
As mentioned in the previous section that the worker needs to wait for all allocated partition files
to be committed during the restart process, which means that the worker need to wait for all the shuffle
@@ -72,9 +71,9 @@ Then client side can record all HARD_SPLIT partition information and pre-commit
then the worker side allocated partitions can be committed in a very short time. User should enable
`celeborn.client.shuffle.batchHandleCommitPartition.enabled`, the default value is false.
-## Example setting:
+### Example setting
-### Worker
+#### Worker
| Key | Value |
|-------------------------------------------------------------------|-------|
@@ -84,7 +83,7 @@ then the worker side allocated partitions can be committed in a very short time.
| celeborn.worker.graceful.shutdown.timeout | 600s |
| celeborn.worker.fetch.port | 9092 |
-### Client
+#### Client
| Key | Value |
|------------------------------------------------------------------|-------|
diff --git a/mkdocs.yml b/mkdocs.yml
index 0955ef4c9..ce1342527 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -76,7 +76,8 @@ nav:
- Deployment:
- Overview: deploy.md
- Kubernetes: deploy_on_k8s.md
+ - Upgrade: upgrade.md
- Ratis Shell: celeborn_ratis_shell.md
- Monitoring: monitoring.md
- Migration Guide: migration.md
- - Upgrade: upgrade.md
+