This is an automated email from the ASF dual-hosted git repository.
zhouky pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/incubator-celeborn.git
The following commit(s) were added to refs/heads/main by this push:
new 3ee067405 [CELEBORN-869][FOLLOWUP][DOC] Document on Integrating Celeborn
3ee067405 is described below
commit 3ee0674058a098350b79e0b16810b1ee615f37d9
Author: zky.zhoukeyong <[email protected]>
AuthorDate: Wed Aug 2 18:17:17 2023 +0800
[CELEBORN-869][FOLLOWUP][DOC] Document on Integrating Celeborn
### What changes were proposed in this pull request?
As title.
### Why are the changes needed?
As title.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Manual test.
Closes #1788 from waitinfuture/869-fu.
Authored-by: zky.zhoukeyong <[email protected]>
Signed-off-by: zky.zhoukeyong <[email protected]>
---
docs/developers/integrate.md | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/docs/developers/integrate.md b/docs/developers/integrate.md
index 25b71d231..d7f2882fb 100644
--- a/docs/developers/integrate.md
+++ b/docs/developers/integrate.md
@@ -112,7 +112,7 @@ maintains status and reuses resource across all shuffles. To make it work, you h
In practice, one `ShuffleClient` instance is created in each Executor process
of Spark, or in each TaskManager
process of Flink.
-## Step Three: Push Data
+## Step Four: Push Data
You can then push data through `ShuffleClient` using
[pushData](../../developers/shuffleclient#api-specification), as in
the following:
@@ -145,11 +145,12 @@ public abstract void mapperEnd(
int numMappers)
```
+- `shuffleId` shuffle id of the current task
- `mapId` map id of the current task
- `attemptId` attempt id of the current task
- `numMappers` number of map ids in this shuffle
-## Step Four: Read Data
+## Step Five: Read Data
After all tasks have successfully called `mapperEnd`, you can start reading data
from some partition id, using the
[readPartition API](../../developers/shuffleclient#api-specification_1), as in
the following code:
@@ -168,7 +169,7 @@ For simplicity, to read the whole data from the partition, you can pass 0 and `I
and `endMapIndex`. This method creates an InputStream for the data, and
guarantees no data loss and no
duplicate reads; otherwise an exception will be thrown.
-## Step Five: Clean Up
+## Step Six: Clean Up
After the shuffle finishes, you can call `LifecycleManager.unregisterShuffle`
to clean up resources related to the
shuffle:
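The patched doc renumbers the integration steps: register, push data, mapper end, read data, clean up. As a rough illustration of that lifecycle, here is a toy in-memory sketch; every class and method name below is a hypothetical stand-in for illustration only, not Celeborn's actual `ShuffleClient`/`LifecycleManager` API:

```java
import java.util.*;

// Toy in-memory model of the lifecycle the doc describes:
// register -> pushData -> mapperEnd -> readPartition -> unregisterShuffle.
// All names here are hypothetical stand-ins, NOT Celeborn's real API.
public class ToyShuffleLifecycle {
    private final Map<Integer, Map<Integer, List<byte[]>>> data = new HashMap<>();
    private final Map<Integer, Set<Integer>> finished = new HashMap<>();
    private final Map<Integer, Integer> mappers = new HashMap<>();

    // Register the shuffle before any mapper pushes.
    public void registerShuffle(int shuffleId, int numMappers) {
        data.put(shuffleId, new HashMap<>());
        finished.put(shuffleId, new HashSet<>());
        mappers.put(shuffleId, numMappers);
    }

    // Each mapper pushes records tagged with a partition id.
    public void pushData(int shuffleId, int mapId, int attemptId,
                         int partitionId, byte[] bytes) {
        data.get(shuffleId)
            .computeIfAbsent(partitionId, k -> new ArrayList<>())
            .add(bytes);
    }

    // A mapper declares it is done (mirrors mapperEnd's argument list).
    public void mapperEnd(int shuffleId, int mapId, int attemptId, int numMappers) {
        finished.get(shuffleId).add(mapId);
    }

    // Reading is only legal after every mapper has called mapperEnd,
    // matching the doc's "after all tasks successfully called mapperEnd" rule.
    public List<byte[]> readPartition(int shuffleId, int partitionId) {
        if (finished.get(shuffleId).size() < mappers.get(shuffleId)) {
            throw new IllegalStateException("not all mappers have called mapperEnd");
        }
        return data.get(shuffleId).getOrDefault(partitionId, Collections.emptyList());
    }

    // Clean up all state held for the shuffle.
    public void unregisterShuffle(int shuffleId) {
        data.remove(shuffleId);
        finished.remove(shuffleId);
        mappers.remove(shuffleId);
    }
}
```

In the real system the push side runs in each Spark Executor or Flink TaskManager process, while cleanup goes through the `LifecycleManager`; this toy collapses both roles into one object purely to show the ordering constraints.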