This is an automated email from the ASF dual-hosted git repository.
ethanfeng pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/celeborn.git
The following commit(s) were added to refs/heads/main by this push:
new 9083dd401 [CELEBORN-1504][FOLLOWUP] Document adds Flink 1.16 support
9083dd401 is described below
commit 9083dd401c717708881ae3a47271a1889a55f133
Author: SteNicholas <[email protected]>
AuthorDate: Wed Nov 13 21:47:29 2024 +0800
[CELEBORN-1504][FOLLOWUP] Document adds Flink 1.16 support
### What changes were proposed in this pull request?
1. Documents Flink 1.16 support, including in `README.md` and `deploy.md`.
2. Updates the description of `celeborn.client.shuffle.compression.codec` to
change the supported Flink version for ZSTD.
### Why are the changes needed?
#2619 added Flink 1.16 support, so the documentation should be updated
accordingly. Meanwhile, since Flink version 1.16, zstd has been supported for
the Flink shuffle client.
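As context for the version change above: the codec is selected with `celeborn.client.shuffle.compression.codec` (default `LZ4`; the value is case-insensitive). A minimal sketch of enabling zstd in a Flink client configuration, assuming Flink 1.16 or later and a Flink setup already wired to Celeborn:

```shell
# Hypothetical excerpt appended to $FLINK_HOME/conf/flink-conf.yaml.
# `zstd` requires Flink 1.16+; on older Flink versions only `lz4` and `none` apply.
cat >> "$FLINK_HOME/conf/flink-conf.yaml" <<'EOF'
celeborn.client.shuffle.compression.codec: zstd
EOF
```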
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
No.
Closes #2904 from SteNicholas/CELEBORN-1504.
Authored-by: SteNicholas <[email protected]>
Signed-off-by: mingji <[email protected]>
---
README.md | 3 ++-
common/src/main/scala/org/apache/celeborn/common/CelebornConf.scala | 2 +-
docs/README.md | 4 ++--
docs/configuration/client.md | 2 +-
docs/deploy.md | 2 +-
docs/developers/overview.md | 2 +-
6 files changed, 8 insertions(+), 7 deletions(-)
diff --git a/README.md b/README.md
index 17c625c06..2ec14888e 100644
--- a/README.md
+++ b/README.md
@@ -41,7 +41,7 @@ Celeborn Worker's slot count is decided by `total usable disk size / average shu
Celeborn worker's slot count decreases when a partition is allocated and increments when a partition is freed.
## Build
-1. Celeborn supports Spark 2.4/3.0/3.1/3.2/3.3/3.4/3.5, Flink 1.14/1.15/1.17/1.18/1.19/1.20 and Hadoop MapReduce 2/3.
+1. Celeborn supports Spark 2.4/3.0/3.1/3.2/3.3/3.4/3.5, Flink 1.14/1.15/1.16/1.17/1.18/1.19/1.20 and Hadoop MapReduce 2/3.
2. Celeborn tested under Scala 2.11/2.12/2.13 and Java 8/11/17 environment.
Build Celeborn via `make-distribution.sh`:
@@ -64,6 +64,7 @@ Package `apache-celeborn-${project.version}-bin.tgz` will be generated.
| Spark 3.5 | ❌ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| Flink 1.14 | ❌ | ✔ | ✔ | ❌ | ❌ | ❌ | ❌ |
| Flink 1.15 | ❌ | ✔ | ✔ | ❌ | ❌ | ❌ | ❌ |
+| Flink 1.16 | ❌ | ✔ | ✔ | ❌ | ❌ | ❌ | ❌ |
| Flink 1.17 | ❌ | ✔ | ✔ | ❌ | ❌ | ❌ | ❌ |
| Flink 1.18 | ❌ | ✔ | ✔ | ❌ | ❌ | ❌ | ❌ |
| Flink 1.19 | ❌ | ✔ | ✔ | ❌ | ❌ | ❌ | ❌ |
diff --git a/common/src/main/scala/org/apache/celeborn/common/CelebornConf.scala b/common/src/main/scala/org/apache/celeborn/common/CelebornConf.scala
index e39a7b521..ed1301699 100644
--- a/common/src/main/scala/org/apache/celeborn/common/CelebornConf.scala
+++ b/common/src/main/scala/org/apache/celeborn/common/CelebornConf.scala
@@ -4643,7 +4643,7 @@ object CelebornConf extends Logging {
.categories("client")
.doc("The codec used to compress shuffle data. By default, Celeborn provides three codecs: `lz4`, `zstd`, `none`. " +
"`none` means that shuffle compression is disabled. " +
- "Since Flink version 1.17, zstd is supported for Flink shuffle client.")
+ "Since Flink version 1.16, zstd is supported for Flink shuffle client.")
.version("0.3.0")
.stringConf
.transform(_.toUpperCase(Locale.ROOT))
diff --git a/docs/README.md b/docs/README.md
index 6597d6005..ea61b7e9c 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -122,7 +122,7 @@ INFO [async-reply] Controller: CommitFiles for local-1690000152711-0 success wit
**Important: Only Flink batch jobs are supported for now.**
#### Copy Celeborn Client to Flink's lib
-Celeborn release binary contains clients for Flink 1.14.x, Flink 1.15.x, Flink 1.17.x, Flink 1.18.x, Flink 1.19.x and Flink 1.20.x, copy the corresponding client jar into Flink's
+Celeborn release binary contains clients for Flink 1.14.x, Flink 1.15.x, Flink 1.16.x, Flink 1.17.x, Flink 1.18.x, Flink 1.19.x and Flink 1.20.x, copy the corresponding client jar into Flink's
`lib/` directory:
```shell
cp $CELEBORN_HOME/flink/celeborn-client-flink-<flink.version>-shaded_<scala.binary.version>-<celeborn.version>.jar $FLINK_HOME/lib/
@@ -130,7 +130,7 @@ cp $CELEBORN_HOME/flink/celeborn-client-flink-<flink.version>-shaded_<scala.bina
#### Add Celeborn configuration to Flink's conf
Set `shuffle-service-factory.class` to Celeborn's ShuffleServiceFactory in Flink configuration file:
-- Flink 1.14.x, Flink 1.15.x, Flink 1.17.x, Flink 1.18.x
+- Flink 1.14.x, Flink 1.15.x, Flink 1.16.x, Flink 1.17.x, Flink 1.18.x
```shell
cd $FLINK_HOME
vi conf/flink-conf.yaml
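The hunk above is truncated before the actual entry being edited. For illustration, the setting added to `flink-conf.yaml` would look like the following; the class name is taken from Celeborn's Flink plugin, so treat it as an assumption if your Celeborn version differs:

```shell
# Hypothetical flink-conf.yaml entry pointing Flink's shuffle service at Celeborn.
shuffle-service-factory.class: org.apache.celeborn.plugin.flink.RemoteShuffleServiceFactory
```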
diff --git a/docs/configuration/client.md b/docs/configuration/client.md
index fb7d353f4..3a6301f54 100644
--- a/docs/configuration/client.md
+++ b/docs/configuration/client.md
@@ -92,7 +92,7 @@ license: |
| celeborn.client.shuffle.batchHandleReleasePartition.interval | 5s | false | Interval for LifecycleManager to schedule handling release partition requests in batch. | 0.3.0 | |
| celeborn.client.shuffle.batchHandleReleasePartition.threads | 8 | false | Threads number for LifecycleManager to handle release partition request in batch. | 0.3.0 | |
| celeborn.client.shuffle.batchHandleRemoveExpiredShuffles.enabled | false | false | Whether to batch remove expired shuffles. This is an optimization switch on removing expired shuffles. | 0.6.0 | |
-| celeborn.client.shuffle.compression.codec | LZ4 | false | The codec used to compress shuffle data. By default, Celeborn provides three codecs: `lz4`, `zstd`, `none`. `none` means that shuffle compression is disabled. Since Flink version 1.17, zstd is supported for Flink shuffle client. | 0.3.0 | celeborn.shuffle.compression.codec,remote-shuffle.job.compression.codec |
+| celeborn.client.shuffle.compression.codec | LZ4 | false | The codec used to compress shuffle data. By default, Celeborn provides three codecs: `lz4`, `zstd`, `none`. `none` means that shuffle compression is disabled. Since Flink version 1.16, zstd is supported for Flink shuffle client. | 0.3.0 | celeborn.shuffle.compression.codec,remote-shuffle.job.compression.codec |
| celeborn.client.shuffle.compression.zstd.level | 1 | false | Compression level for Zstd compression codec, its value should be an integer between -5 and 22. Increasing the compression level will result in better compression at the expense of more CPU and memory. | 0.3.0 | celeborn.shuffle.compression.zstd.level |
| celeborn.client.shuffle.decompression.lz4.xxhash.instance | <undefined> | false | Decompression XXHash instance for Lz4. Available options: JNI, JAVASAFE, JAVAUNSAFE. | 0.3.2 | |
| celeborn.client.shuffle.dynamicResourceEnabled | false | false | When enabled, the ChangePartitionManager will obtain candidate workers from the availableWorkers pool during heartbeats when worker resource change. | 0.6.0 | |
diff --git a/docs/deploy.md b/docs/deploy.md
index d151ef494..1400e56cc 100644
--- a/docs/deploy.md
+++ b/docs/deploy.md
@@ -212,7 +212,7 @@ spark.executor.userClassPathFirst false
**Important: Only Flink batch jobs are supported for now. Due to the Shuffle Service in Flink is cluster-granularity, if you want to use Celeborn in a session cluster, it will not be able to submit both streaming and batch job to the same cluster. We plan to get rid of this restriction for Hybrid Shuffle mode in a future release.**
-Celeborn release binary contains clients for Flink 1.14.x, Flink 1.15.x, Flink 1.17.x, Flink 1.18.x, Flink 1.19.x and Flink 1.20.x, copy the corresponding client jar into Flink's
+Celeborn release binary contains clients for Flink 1.14.x, Flink 1.15.x, Flink 1.16.x, Flink 1.17.x, Flink 1.18.x, Flink 1.19.x and Flink 1.20.x, copy the corresponding client jar into Flink's
`lib/` directory:
Copy `$CELEBORN_HOME/flink/celeborn-client-flink-<flink.version>-shaded_<scala.binary.version>-<celeborn.version>.jar` to `$FLINK_HOME/lib/`.
diff --git a/docs/developers/overview.md b/docs/developers/overview.md
index 1b05f0e35..c7b866b01 100644
--- a/docs/developers/overview.md
+++ b/docs/developers/overview.md
@@ -89,7 +89,7 @@ Celeborn's primary components(i.e. Master, Worker, Client) are engine irrelevant
and easy to implement plugins for various engines.
Currently, Celeborn officially supports [Spark](https://spark.apache.org/)(both Spark 2.x and Spark 3.x),
-[Flink](https://flink.apache.org/)(1.14/1.15/1.17/1.18/1.19), and
+[Flink](https://flink.apache.org/)(1.14/1.15/1.16/1.17/1.18/1.19), and
[Gluten](https://github.com/apache/incubator-gluten). Also, developers are integrating Celeborn with other engines,
for example [MR3](https://mr3docs.datamonad.com/docs/mr3/).