This is an automated email from the ASF dual-hosted git repository.
nicholasjiang pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/celeborn.git
The following commit(s) were added to refs/heads/main by this push:
new ec67366b7 [CELEBORN-1684] Fix ambiguous client jar expression of document
ec67366b7 is described below
commit ec67366b7aba7af55d0452fb5ba9b5656fd57bc3
Author: szt <[email protected]>
AuthorDate: Tue Nov 5 13:48:22 2024 +0800
[CELEBORN-1684] Fix ambiguous client jar expression of document
### What changes were proposed in this pull request?
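When users deploy using the release binary as outlined in the
documentation, the instructions for copying the client JAR can be unclear.
This change replaces the `<Celeborn Client Jar>` placeholder and the `*.jar`
globs in `docs/README.md` and `docs/deploy.md` with the full client JAR file
name pattern.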
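As an illustration of how the placeholder pattern in the updated docs expands to a concrete file name, here is a minimal sketch. The helper name `celeborn_spark_client_jar` is hypothetical, and the version values passed to it are example assumptions rather than values taken from this commit. Only the file name pattern itself comes from the diff below. The sketch prints the resulting `cp` command instead of running it.

```shell
# Hypothetical helper (not part of Celeborn): builds the Spark client jar
# file name from the three placeholders used in the updated docs:
#   <spark.major.version>, <scala.binary.version>, <celeborn.version>
celeborn_spark_client_jar() {
  spark_major="$1"
  scala_binary="$2"
  celeborn_version="$3"
  printf 'celeborn-client-spark-%s-shaded_%s-%s.jar\n' \
    "$spark_major" "$scala_binary" "$celeborn_version"
}

# Example values are assumptions chosen for illustration only.
jar="$(celeborn_spark_client_jar 3 2.12 0.5.1)"

# Print the copy command rather than executing it.
echo "cp \$CELEBORN_HOME/spark/$jar \$SPARK_HOME/jars/"
```

The Flink and MapReduce jar names in the diff follow the same shape, with `<flink.version>` in place of the Spark major version for Flink and no engine version at all for MapReduce.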
### Why are the changes needed?
No
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?

Closes #2877 from zaynt4606/md.
Authored-by: szt <[email protected]>
Signed-off-by: SteNicholas <[email protected]>
---
docs/README.md | 8 ++++----
docs/deploy.md | 13 ++++++++++---
2 files changed, 14 insertions(+), 7 deletions(-)
diff --git a/docs/README.md b/docs/README.md
index 64c50357d..6597d6005 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -87,7 +87,7 @@ WorkerRef: null
Celeborn release binary contains clients for Spark 2.x and Spark 3.x, copy the corresponding client jar into Spark's
`jars/` directory:
```shell
-cp $CELEBORN_HOME/spark/<Celeborn Client Jar> $SPARK_HOME/jars/
+cp $CELEBORN_HOME/spark/celeborn-client-spark-<spark.major.version>-shaded_<scala.binary.version>-<celeborn.version>.jar $SPARK_HOME/jars/
```
#### Start spark-shell
Set `spark.shuffle.manager` to Celeborn's ShuffleManager, and turn off `spark.shuffle.service.enabled`:
@@ -125,7 +125,7 @@ INFO [async-reply] Controller: CommitFiles for local-1690000152711-0 success wit
Celeborn release binary contains clients for Flink 1.14.x, Flink 1.15.x, Flink 1.17.x, Flink 1.18.x, Flink 1.19.x and Flink 1.20.x, copy the corresponding client jar into Flink's
`lib/` directory:
```shell
-cp $CELEBORN_HOME/flink/<Celeborn Client Jar> $FLINK_HOME/lib/
+cp $CELEBORN_HOME/flink/celeborn-client-flink-<flink.version>-shaded_<scala.binary.version>-<celeborn.version>.jar $FLINK_HOME/lib/
```
#### Add Celeborn configuration to Flink's conf
Set `shuffle-service-factory.class` to Celeborn's ShuffleServiceFactory in Flink configuration file:
@@ -181,8 +181,8 @@ INFO [async-reply] Controller: CommitFiles for local-1690000152711-0 success wit
### Copy Celeborn Client to MapReduce's classpath
1. Copy `$CELEBORN_HOME/mr/*.jar` into `mapreduce.application.classpath` and `yarn.application.classpath`.
```shell
-cp $CELEBORN_HOME/mr/<Celeborn Client Jar> <mapreduce.application.classpath>
-cp $CELEBORN_HOME/mr/<Celeborn Client Jar> <yarn.application.classpath>
+cp $CELEBORN_HOME/mr/celeborn-client-mr-shaded_<scala.binary.version>-<celeborn.version>.jar <mapreduce.application.classpath>
+cp $CELEBORN_HOME/mr/celeborn-client-mr-shaded_<scala.binary.version>-<celeborn.version>.jar <yarn.application.classpath>
```
2. Restart your yarn cluster.
### Add Celeborn configuration to MapReduce's conf
diff --git a/docs/deploy.md b/docs/deploy.md
index e67b827da..30909603f 100644
--- a/docs/deploy.md
+++ b/docs/deploy.md
@@ -155,7 +155,10 @@ WorkerRef: null
```
## Deploy Spark client
-Copy `$CELEBORN_HOME/spark/*.jar` to `$SPARK_HOME/jars/`.
+Celeborn release binary contains clients for Spark 2.x and Spark 3.x, copy the corresponding client jar into Spark's
+`jars/` directory:
+
+Copy `$CELEBORN_HOME/spark/celeborn-client-spark-<spark.major.version>-shaded_<scala.binary.version>-<celeborn.version>.jar` to `$SPARK_HOME/jars/`.
### Spark Configuration
To use Celeborn, the following spark configurations should be added.
@@ -209,7 +212,10 @@ spark.executor.userClassPathFirst false
**Important: Only Flink batch jobs are supported for now.**
-Copy `$CELEBORN_HOME/flink/*.jar` to `$FLINK_HOME/lib/`.
+Celeborn release binary contains clients for Flink 1.14.x, Flink 1.15.x, Flink 1.17.x, Flink 1.18.x, Flink 1.19.x and Flink 1.20.x, copy the corresponding client jar into Flink's
+`lib/` directory:
+
+Copy `$CELEBORN_HOME/flink/celeborn-client-flink-<flink.version>-shaded_<scala.binary.version>-<celeborn.version>.jar` to `$FLINK_HOME/lib/`.
### Flink Configuration
Celeborn supports two Flink integration strategies: remote shuffle service (since Flink 1.14) and [hybrid shuffle](https://nightlies.apache.org/flink/flink-docs-stable/docs/ops/batch/batch_shuffle/#hybrid-shuffle) (since Flink 1.20).
@@ -259,7 +265,8 @@ celeborn.rpc.dispatcher.numThreads: 32
**Note**: The config option `execution.batch-shuffle-mode` should configure as `ALL_EXCHANGES_HYBRID_FULL`.
## Deploy MapReduce client
-Copy `$CELEBORN_HOME/mr/*.jar` into `mapreduce.application.classpath` and `yarn.application.classpath`.
+Copy `$CELEBORN_HOME/mr/celeborn-client-mr-shaded_<scala.binary.version>-<celeborn.version>.jar` into `mapreduce.application.classpath` and `yarn.application.classpath`.
+
Meanwhile, configure the following settings in YARN and MapReduce config.
```bash
-Dyarn.app.mapreduce.am.job.recovery.enable=false