This is an automated email from the ASF dual-hosted git repository.
nicholasjiang pushed a commit to branch branch-0.4
in repository https://gitbox.apache.org/repos/asf/celeborn.git
The following commit(s) were added to refs/heads/branch-0.4 by this push:
new 641a802e2 [INFRA] Remove incubator/incubating for graduation
641a802e2 is described below
commit 641a802e2c14ca1da8e1651fe46f6948c4104e21
Author: SteNicholas <[email protected]>
AuthorDate: Wed Mar 27 13:54:47 2024 +0800
[INFRA] Remove incubator/incubating for graduation
### What changes were proposed in this pull request?
Remove incubator/incubating for graduation including:
- Remove `incubator`/`Incubating`.
- Remove `DISCLAIMER` and corresponding link.
- Update Release scripts and template.
Fix #2415.
### Why are the changes needed?
The ASF board has approved a resolution to graduate Celeborn into a full
Top Level Project. To transition from the Apache Incubator to a new TLP,
there are a few action items we need to complete.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
No.
Closes #2421 from SteNicholas/infra-graduation.
Authored-by: SteNicholas <[email protected]>
Signed-off-by: mingji <[email protected]>
(cherry picked from commit c9b878a2f5b719d5c5fe3d68cd4e43b53ec25c14)
Signed-off-by: SteNicholas <[email protected]>
---
DISCLAIMER | 10 ----
NOTICE | 2 +-
NOTICE-binary | 2 +-
README.md | 55 ++++++++++------------
build/make-distribution.sh | 1 -
build/release/release.sh | 4 +-
.../src/main/resources/META-INF/NOTICE | 2 +-
.../src/main/resources/META-INF/NOTICE | 2 +-
.../src/main/resources/META-INF/NOTICE | 2 +-
.../src/main/resources/META-INF/NOTICE | 2 +-
.../mr-shaded/src/main/resources/META-INF/NOTICE | 2 +-
.../src/main/resources/META-INF/NOTICE | 2 +-
.../src/main/resources/META-INF/NOTICE | 2 +-
dev/merge_pr.py | 4 +-
docs/README.md | 27 ++++++-----
docs/developers/glutensupport.md | 14 +++---
docs/developers/overview.md | 6 +--
mkdocs.yml | 20 ++------
project/CelebornBuild.scala | 4 +-
19 files changed, 71 insertions(+), 92 deletions(-)
diff --git a/DISCLAIMER b/DISCLAIMER
deleted file mode 100644
index 0e5e17dde..000000000
--- a/DISCLAIMER
+++ /dev/null
@@ -1,10 +0,0 @@
-Apache Celeborn (Incubating) is an effort undergoing incubation at the Apache
-Software Foundation (ASF), sponsored by the Apache Incubator PMC.
-
-Incubation is required of all newly accepted projects until a further review
-indicates that the infrastructure, communications, and decision making process
-have stabilized in a manner consistent with other successful ASF projects.
-
-While incubation status is not necessarily a reflection of the completeness
-or stability of the code, it does indicate that the project has yet to be
-fully endorsed by the ASF.
diff --git a/NOTICE b/NOTICE
index 34ec3f608..b63ca7b19 100644
--- a/NOTICE
+++ b/NOTICE
@@ -1,5 +1,5 @@
-Apache Celeborn (Incubating)
+Apache Celeborn
Copyright 2022-2024 The Apache Software Foundation.
This product includes software developed at
diff --git a/NOTICE-binary b/NOTICE-binary
index 8e59fff41..4d6fb0456 100644
--- a/NOTICE-binary
+++ b/NOTICE-binary
@@ -1,5 +1,5 @@
-Apache Celeborn (Incubating)
+Apache Celeborn
Copyright 2022-2024 The Apache Software Foundation.
This product includes software developed at
diff --git a/README.md b/README.md
index 0a9304c48..ca40effad 100644
--- a/README.md
+++ b/README.md
@@ -1,7 +1,7 @@
-# Apache Celeborn (Incubating)
+# Apache Celeborn
-[](https://github.com/apache/incubator-celeborn/actions/workflows/maven.yml)
-Celeborn is dedicated to improving the efficiency and elasticity of
+[](https://github.com/apache/celeborn/actions/workflows/maven.yml)
+Celeborn (/ˈkeləbɔ:n/) is dedicated to improving the efficiency and elasticity of
different map-reduce engines and provides an elastic, high-efficient
management service for intermediate data including shuffle data, spilled data,
result data, etc. Currently, Celeborn is focusing on shuffle data.
@@ -44,12 +44,12 @@ Celeborn worker's slot count decreases when a partition is allocated and increme
1. Celeborn supports Spark 2.4/3.0/3.1/3.2/3.3/3.4/3.5, Flink 1.14/1.15/1.17/1.18 and Hadoop MapReduce 2/3.
2. Celeborn tested under Scala 2.11/2.12/2.13 and Java 8/11/17 environment.
-Build Celeborn
+Build Celeborn via `make-distribution.sh`:
```shell
./build/make-distribution.sh -Pspark-2.4/-Pspark-3.0/-Pspark-3.1/-Pspark-3.2/-Pspark-3.3/-Pspark-3.4/-Pflink-1.14/-Pflink-1.15/-Pflink-1.17/-Pflink-1.18/-Pmr
```
-package apache-celeborn-${project.version}-bin.tgz will be generated.
+Package `apache-celeborn-${project.version}-bin.tgz` will be generated.
> **_NOTE:_** The following table indicates the compatibility of Celeborn
> Spark and Flink clients with different versions of Spark and Flink for
> various Java and Scala versions.
@@ -67,7 +67,7 @@ package apache-celeborn-${project.version}-bin.tgz will be generated.
| Flink 1.17 | ❌ | ✔ | ✔ | ❌ | ❌ | ❌ | ❌ |
| Flink 1.18 | ❌ | ✔ | ✔ | ❌ | ❌ | ❌ | ❌ |
-To compile the client for Spark 2.4 with Scala 2.12, please use the following command
+To compile the client for Spark 2.4 with Scala 2.12, please use the following command:
- Scala 2.12.8/2.12.9/2.12.10
```shell
@@ -107,8 +107,8 @@ Celeborn cluster composes of Master and Worker nodes, the Master supports both s
### Deploy Celeborn
#### Deploy on host
-1. Unzip the tarball to `$CELEBORN_HOME`
-2. Modify environment variables in `$CELEBORN_HOME/conf/celeborn-env.sh`
+1. Unzip the tarball to `$CELEBORN_HOME`.
+2. Modify environment variables in `$CELEBORN_HOME/conf/celeborn-env.sh`.
EXAMPLE:
```properties
@@ -117,7 +117,7 @@ CELEBORN_MASTER_MEMORY=4g
CELEBORN_WORKER_MEMORY=2g
CELEBORN_WORKER_OFFHEAP_MEMORY=4g
```
-3. Modify configurations in `$CELEBORN_HOME/conf/celeborn-defaults.conf`
+3. Modify configurations in `$CELEBORN_HOME/conf/celeborn-defaults.conf`.
EXAMPLE: single master cluster
```properties
@@ -151,7 +151,7 @@ celeborn.worker.replicate.fastFail.duration 240s
celeborn.storage.hdfs.kerberos.principal user@REALM
celeborn.storage.hdfs.kerberos.keytab /path/to/user.keytab
-# If your hosts have disk raid or use lvm, set celeborn.worker.monitor.disk.enabled to false
+# If your hosts have disk raid or use lvm, set `celeborn.worker.monitor.disk.enabled` to false
celeborn.worker.monitor.disk.enabled false
```
@@ -198,26 +198,24 @@ celeborn.worker.flusher.hdfs.buffer.size 4m
celeborn.storage.hdfs.dir hdfs://<namenode>/celeborn
celeborn.worker.replicate.fastFail.duration 240s
-# If your hosts have disk raid or use lvm, set celeborn.worker.monitor.disk.enabled to false
+# If your hosts have disk raid or use lvm, set `celeborn.worker.monitor.disk.enabled` to false
celeborn.worker.monitor.disk.enabled false
```
Flink engine related configurations:
```properties
-# if you are using Celeborn for flink, these settings will be needed
+# If you are using Celeborn for flink, these settings will be needed.
celeborn.worker.directMemoryRatioForReadBuffer 0.4
-celeborn.worker.directMemoryRatioToResume 0.6
-# these setting will affect performance.
+celeborn.worker.directMemoryRatioToResume 0.5
+# These setting will affect performance.
# If there is enough off-heap memory, you can try to increase read buffers.
# Read buffer max memory usage for a data partition is `taskmanager.memory.segment-size * readBuffersMax`
celeborn.worker.partition.initial.readBuffersMin 512
celeborn.worker.partition.initial.readBuffersMax 1024
celeborn.worker.readBuffer.allocationWait 10ms
-# Currently, shuffle partitionSplit is not supported, so you should disable split in celeborn worker side or set `celeborn.client.shuffle.partitionSplit.threshold` to a high value in flink client side.
-celeborn.worker.shuffle.partitionSplit.enabled false
```
-4. Copy Celeborn and configurations to all nodes
+4. Copy Celeborn and configurations to all nodes.
5. Start all services. If you install Celeborn distribution in the same path on every node and your
cluster can perform SSH login then you can fill `$CELEBORN_HOME/conf/hosts` and
use `$CELEBORN_HOME/sbin/start-all.sh` to start all
@@ -250,14 +248,14 @@ WorkerRef: null
Please refer to our [website](https://celeborn.apache.org/docs/latest/deploy_on_k8s/)
### Deploy Spark client
-Copy $CELEBORN_HOME/spark/*.jar to $SPARK_HOME/jars/
+Copy `$CELEBORN_HOME/spark/*.jar` to `$SPARK_HOME/jars/`.
#### Spark Configuration
-To use Celeborn,the following spark configurations should be added.
+To use Celeborn, the following spark configurations should be added.
```properties
# Shuffle manager class name changed in 0.3.0:
-# before 0.3.0: org.apache.spark.shuffle.celeborn.RssShuffleManager
-# since 0.3.0: org.apache.spark.shuffle.celeborn.SparkShuffleManager
+# before 0.3.0: `org.apache.spark.shuffle.celeborn.RssShuffleManager`
+# since 0.3.0: `org.apache.spark.shuffle.celeborn.SparkShuffleManager`
spark.shuffle.manager org.apache.spark.shuffle.celeborn.SparkShuffleManager
# must use kryo serializer because java serializer do not support relocation
spark.serializer org.apache.spark.serializer.KryoSerializer
@@ -272,13 +270,13 @@ spark.shuffle.service.enabled false
# Sort shuffle writer uses less memory than hash shuffle writer, if your shuffle partition count is large, try to use sort hash writer.
spark.celeborn.client.spark.shuffle.writer hash
-# We recommend setting spark.celeborn.client.push.replicate.enabled to true to enable server-side data replication
+# We recommend setting `spark.celeborn.client.push.replicate.enabled` to true to enable server-side data replication
# If you have only one worker, this setting must be false
# If your Celeborn is using HDFS, it's recommended to set this setting to false
spark.celeborn.client.push.replicate.enabled true
# Support for Spark AQE only tested under Spark 3
-# we recommend setting localShuffleReader to false to get better performance of Celeborn
+# we recommend setting localShuffleReader to false for getting better performance of Celeborn
spark.sql.adaptive.localShuffleReader.enabled false
# If Celeborn is using HDFS
@@ -296,7 +294,7 @@ spark.dynamicAllocation.shuffleTracking.enabled false
```
### Deploy Flink client
-Copy $CELEBORN_HOME/flink/*.jar to $FLINK_HOME/lib/
+Copy `$CELEBORN_HOME/flink/*.jar` to `$FLINK_HOME/lib/`.
#### Flink Configuration
To use Celeborn, the following flink configurations should be added.
@@ -322,9 +320,9 @@ taskmanager.memory.task.off-heap.size: 512m
```
**Note**: The config option `execution.batch-shuffle-mode` should configure as `ALL_EXCHANGES_BLOCKING`.
-### Deploy mapreduce client
-Add $CELEBORN_HOME/mr/*.jar to to `mapreduce.application.classpath` and `yarn.application.classpath`.
-And setting the following settings in YARN and MapReduce config.
+### Deploy MapReduce client
+Copy `$CELEBORN_HOME/mr/*.jar` into `mapreduce.application.classpath` and `yarn.application.classpath`.
+Meanwhile, configure the following settings in YARN and MapReduce config.
```bash
-Dyarn.app.mapreduce.am.job.recovery.enable=false
-Dmapreduce.job.reduce.slowstart.completedmaps=1
@@ -334,7 +332,6 @@ And setting the following settings in YARN and MapReduce config.
-Dmapreduce.job.reduce.shuffle.consumer.plugin.class=org.apache.hadoop.mapreduce.task.reduce.CelebornShuffleConsumer
```
-
### Best Practice
If you want to set up a production-ready Celeborn cluster, your cluster should have at least 3 masters and at least 4 workers.
Masters and works can be deployed on the same node but should not deploy multiple masters or workers on the same node.
@@ -371,7 +368,7 @@ Contact us through the following mailing list.
### Report Issues or Submit Pull Request
-If you meet any questions, feel free to file a 🔗[Jira Ticket](https://issues.apache.org/jira/projects/CELEBORN/issues) or connect us and fix it by submitting a 🔗[Pull Request](https://github.com/apache/incubator-celeborn/pulls).
+If you meet any questions, feel free to file a 🔗[Jira Ticket](https://issues.apache.org/jira/projects/CELEBORN/issues) or connect us and fix it by submitting a 🔗[Pull Request](https://github.com/apache/celeborn/pulls).
| IM | Contact Info |
|:---------|:------------------------------------------------------------------------------------------------------------------------------------------|
diff --git a/build/make-distribution.sh b/build/make-distribution.sh
index 5e9b983dc..afd787fc7 100755
--- a/build/make-distribution.sh
+++ b/build/make-distribution.sh
@@ -390,7 +390,6 @@ cp "$PROJECT_DIR/docker/Dockerfile" "$DIST_DIR/docker"
cp -r "$PROJECT_DIR/charts" "$DIST_DIR"
# Copy license files
-cp "$PROJECT_DIR/DISCLAIMER" "$DIST_DIR/DISCLAIMER"
if [[ -f $"$PROJECT_DIR/LICENSE-binary" ]]; then
cp "$PROJECT_DIR/LICENSE-binary" "$DIST_DIR/LICENSE"
cp -r "$PROJECT_DIR/licenses-binary" "$DIST_DIR/licenses"
diff --git a/build/release/release.sh b/build/release/release.sh
index 48bb5093c..6ed0a96e4 100755
--- a/build/release/release.sh
+++ b/build/release/release.sh
@@ -56,8 +56,8 @@ fi
RELEASE_TAG="v${RELEASE_VERSION}-rc${RELEASE_RC_NO}"
-SVN_STAGING_REPO="https://dist.apache.org/repos/dist/dev/incubator/celeborn"
-SVN_RELEASE_REPO="https://dist.apache.org/repos/dist/release/incubator/celeborn"
+SVN_STAGING_REPO="https://dist.apache.org/repos/dist/dev/celeborn"
+SVN_RELEASE_REPO="https://dist.apache.org/repos/dist/release/celeborn"
RELEASE_DIR="${PROJECT_DIR}/tmp"
SVN_STAGING_DIR="${PROJECT_DIR}/tmp/svn-dev"
diff --git a/client-flink/flink-1.14-shaded/src/main/resources/META-INF/NOTICE b/client-flink/flink-1.14-shaded/src/main/resources/META-INF/NOTICE
index 63b5024b0..43452a38a 100644
--- a/client-flink/flink-1.14-shaded/src/main/resources/META-INF/NOTICE
+++ b/client-flink/flink-1.14-shaded/src/main/resources/META-INF/NOTICE
@@ -1,5 +1,5 @@
-Apache Celeborn (Incubating)
+Apache Celeborn
Copyright 2022-2024 The Apache Software Foundation.
This product includes software developed at
diff --git a/client-flink/flink-1.15-shaded/src/main/resources/META-INF/NOTICE b/client-flink/flink-1.15-shaded/src/main/resources/META-INF/NOTICE
index 63b5024b0..43452a38a 100644
--- a/client-flink/flink-1.15-shaded/src/main/resources/META-INF/NOTICE
+++ b/client-flink/flink-1.15-shaded/src/main/resources/META-INF/NOTICE
@@ -1,5 +1,5 @@
-Apache Celeborn (Incubating)
+Apache Celeborn
Copyright 2022-2024 The Apache Software Foundation.
This product includes software developed at
diff --git a/client-flink/flink-1.17-shaded/src/main/resources/META-INF/NOTICE b/client-flink/flink-1.17-shaded/src/main/resources/META-INF/NOTICE
index 63b5024b0..43452a38a 100644
--- a/client-flink/flink-1.17-shaded/src/main/resources/META-INF/NOTICE
+++ b/client-flink/flink-1.17-shaded/src/main/resources/META-INF/NOTICE
@@ -1,5 +1,5 @@
-Apache Celeborn (Incubating)
+Apache Celeborn
Copyright 2022-2024 The Apache Software Foundation.
This product includes software developed at
diff --git a/client-flink/flink-1.18-shaded/src/main/resources/META-INF/NOTICE b/client-flink/flink-1.18-shaded/src/main/resources/META-INF/NOTICE
index 63b5024b0..43452a38a 100644
--- a/client-flink/flink-1.18-shaded/src/main/resources/META-INF/NOTICE
+++ b/client-flink/flink-1.18-shaded/src/main/resources/META-INF/NOTICE
@@ -1,5 +1,5 @@
-Apache Celeborn (Incubating)
+Apache Celeborn
Copyright 2022-2024 The Apache Software Foundation.
This product includes software developed at
diff --git a/client-mr/mr-shaded/src/main/resources/META-INF/NOTICE b/client-mr/mr-shaded/src/main/resources/META-INF/NOTICE
index 5b5319639..9a5437b44 100644
--- a/client-mr/mr-shaded/src/main/resources/META-INF/NOTICE
+++ b/client-mr/mr-shaded/src/main/resources/META-INF/NOTICE
@@ -1,5 +1,5 @@
-Apache Celeborn (Incubating)
+Apache Celeborn
Copyright 2022-2024 The Apache Software Foundation.
This product includes software developed at
diff --git a/client-spark/spark-2-shaded/src/main/resources/META-INF/NOTICE b/client-spark/spark-2-shaded/src/main/resources/META-INF/NOTICE
index 1fd47fe3d..c48952d00 100644
--- a/client-spark/spark-2-shaded/src/main/resources/META-INF/NOTICE
+++ b/client-spark/spark-2-shaded/src/main/resources/META-INF/NOTICE
@@ -1,5 +1,5 @@
-Apache Celeborn (Incubating)
+Apache Celeborn
Copyright 2022-2024 The Apache Software Foundation.
This product includes software developed at
diff --git a/client-spark/spark-3-shaded/src/main/resources/META-INF/NOTICE b/client-spark/spark-3-shaded/src/main/resources/META-INF/NOTICE
index 1fd47fe3d..c48952d00 100644
--- a/client-spark/spark-3-shaded/src/main/resources/META-INF/NOTICE
+++ b/client-spark/spark-3-shaded/src/main/resources/META-INF/NOTICE
@@ -1,5 +1,5 @@
-Apache Celeborn (Incubating)
+Apache Celeborn
Copyright 2022-2024 The Apache Software Foundation.
This product includes software developed at
diff --git a/dev/merge_pr.py b/dev/merge_pr.py
index f46370d59..4794e62aa 100755
--- a/dev/merge_pr.py
+++ b/dev/merge_pr.py
@@ -64,8 +64,8 @@ JIRA_ACCESS_TOKEN = os.environ.get("JIRA_ACCESS_TOKEN")
GITHUB_OAUTH_KEY = os.environ.get("GITHUB_OAUTH_KEY")
-GITHUB_BASE = "https://github.com/apache/incubator-celeborn/pull"
-GITHUB_API_BASE = "https://api.github.com/repos/apache/incubator-celeborn"
+GITHUB_BASE = "https://github.com/apache/celeborn/pull"
+GITHUB_API_BASE = "https://api.github.com/repos/apache/celeborn"
JIRA_BASE = "https://issues.apache.org/jira/browse"
JIRA_API_BASE = "https://issues.apache.org/jira"
# Prefix added to temporary branches
diff --git a/docs/README.md b/docs/README.md
index 2187835ca..125b6f4a8 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -20,11 +20,11 @@ license: |
---
Quick Start
===
-This documentation gives a quick start guide for running Apache Spark/Flink with Apache Celeborn™(Incubating).
+This documentation gives a quick start guide for running Spark/Flink/MapReduce with Apache Celeborn™.
### Download Celeborn
Download the latest Celeborn binary from the [Downloading Page](https://celeborn.apache.org/download/).
-Decompress the binary and set `$CELEBORN_HOME`
+Decompress the binary and set `$CELEBORN_HOME`.
```shell
tar -C <DST_DIR> -zxvf apache-celeborn-<VERSION>-bin.tgz
export CELEBORN_HOME=<Decompressed path>
@@ -37,7 +37,7 @@ cd $CELEBORN_HOME/conf
cp log4j2.xml.template log4j2.xml
```
#### Configure Storage
-Configure the directory to store shuffle data, for example `$CELEBORN_HOME/shuffle`
+Configure the directory to store shuffle data, for example `$CELEBORN_HOME/shuffle`.
```shell
cd $CELEBORN_HOME/conf
echo "celeborn.worker.storage.dirs=$CELEBORN_HOME/shuffle" > celeborn-defaults.conf
@@ -154,11 +154,15 @@ INFO [async-reply] Controller: CommitFiles for local-1690000152711-0 success wit
```
## Start MapReduce With Celeborn
-### Add Celeborn client jar to MapReduce's classpath
-1.Add $CELEBORN_HOME/mr/*.jar to `mapreduce.application.classpath` and `yarn.application.classpath`.
-2.Restart your yarn cluster.
-### Add Celeborn configurations to MapReduce's conf
-Modify `${HADOOP_CONF_DIR}/yarn-site.xml`
+### Copy Celeborn Client to MapReduce's classpath
+1. Copy `$CELEBORN_HOME/mr/*.jar` into `mapreduce.application.classpath` and `yarn.application.classpath`.
+```shell
+cp $CELEBORN_HOME/mr/<Celeborn Client Jar> <mapreduce.application.classpath>
+cp $CELEBORN_HOME/mr/<Celeborn Client Jar> <yarn.application.classpath>
+```
+2. Restart your yarn cluster.
+### Add Celeborn configuration to MapReduce's conf
+- Modify configurations in `${HADOOP_CONF_DIR}/yarn-site.xml`.
```xml
<configuration>
<property>
@@ -173,7 +177,7 @@ Modify `${HADOOP_CONF_DIR}/yarn-site.xml`
</property>
</configuration>
```
-Modify `${HADOOP_CONF_DIR}/mapred-site.xml`
+- Modify configurations in `${HADOOP_CONF_DIR}/mapred-site.xml`.
```xml
<configuration>
<property>
@@ -195,10 +199,11 @@ Modify `${HADOOP_CONF_DIR}/mapred-site.xml`
</property>
</configuration>
```
-Then you can run a word count to check whether your configs are correct.
+Then deploy the example word count to the running cluster for verifying whether above configurations are correct.
```shell
cd $HADOOP_HOME
-hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.1.jar wordcount /sometext /someoutput
+
+./bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.1.jar wordcount /someinput /someoutput
```
During the MapReduce Job, you should see the following message in Celeborn Master's log:
```log
diff --git a/docs/developers/glutensupport.md b/docs/developers/glutensupport.md
index 3879e353f..6092caec9 100644
--- a/docs/developers/glutensupport.md
+++ b/docs/developers/glutensupport.md
@@ -19,9 +19,9 @@ license: |
# Gluten Support
## Velox Backend
-[Gluten](https://github.com/oap-project/gluten) with velox backend supports Celeborn as remote shuffle service. Below introduction is used to enable this feature
+[Gluten](https://github.com/apache/incubator-gluten) with velox backend supports Celeborn as remote shuffle service. Below introduction is used to enable this feature.
-First refer to this URL(https://github.com/oap-project/gluten/blob/main/docs/get-started/Velox.md) to build Gluten with velox backend.
+First refer to [Get Started With Velox](https://github.com/apache/incubator-gluten/blob/main/docs/get-started/Velox.md) to build Gluten with velox backend.
When compiling the Gluten Java module, it's required to enable `rss` profile, as follows:
@@ -31,10 +31,10 @@ mvn clean package -Pbackends-velox -Pspark-3.3 -Prss -DskipTests
Then add the Gluten and Spark Celeborn Client packages to your Spark application's classpath(usually add them into `$SPARK_HOME/jars`).
-- Celeborn: celeborn-client-spark-3-shaded_2.12-0.3.0-incubating.jar
-- Gluten: gluten-velox-bundle-spark3.x_2.12-xx-xx-SNAPSHOT.jar, gluten-thirdparty-lib-xx.jar
+- Celeborn: `celeborn-client-spark-3-shaded_2.12-[celebornVersion].jar`
+- Gluten: `gluten-velox-bundle-spark3.x_2.12-xx-xx-SNAPSHOT.jar`, `gluten-thirdparty-lib-xx.jar`
-Currently to use Gluten following configurations are required in `spark-defaults.conf`
+Currently, to use Gluten following configurations are required in `spark-defaults.conf`.
```
spark.shuffle.manager org.apache.spark.shuffle.gluten.celeborn.CelebornShuffleManager
@@ -42,7 +42,7 @@ spark.shuffle.manager org.apache.spark.shuffle.gluten.celeborn.CelebornShuffleMa
# celeborn master
spark.celeborn.master.endpoints clb-master:9097
-# we recommend set spark.celeborn.push.replicate.enabled to true to enable server-side data replication
+# we recommend set `spark.celeborn.push.replicate.enabled` to true to enable server-side data replication
# If you have only one worker, this setting must be false
spark.celeborn.client.push.replicate.enabled true
@@ -52,7 +52,7 @@ spark.shuffle.service.enabled false
spark.sql.adaptive.localShuffleReader.enabled false
# If you want to use dynamic resource allocation,
-# please refer to this URL (https://github.com/apache/incubator-celeborn/tree/main/assets/spark-patch) to apply the patch into your own Spark.
+# please refer to this URL (https://github.com/apache/celeborn/tree/main/assets/spark-patch) to apply the patch into your own Spark.
spark.dynamicAllocation.enabled false
```
diff --git a/docs/developers/overview.md b/docs/developers/overview.md
index 9617fa989..a57948949 100644
--- a/docs/developers/overview.md
+++ b/docs/developers/overview.md
@@ -18,7 +18,7 @@ license: |
# Celeborn Architecture
-This article introduces high level Apache Celeborn™(Incubating) Architecture. For more detailed description of each module/process,
+This article introduces high level Apache Celeborn™ Architecture. For more detailed description of each module/process,
please refer to dedicated articles.
## Why Celeborn
@@ -30,13 +30,13 @@ the disk and network inefficiency (M * N between Mappers and Reducers) in tradit
Besides inefficiency, traditional shuffle framework requires large local storage in compute node to store shuffle
data, thus blocks the adoption of disaggregated architecture.
-Apache Celeborn(Incubating) solves the problems by reorganizing shuffle data in a more efficient way, and storing the data in
+Apache Celeborn solves the problems by reorganizing shuffle data in a more efficient way, and storing the data in
a separate service. The high level architecture of Celeborn is as follows:

## Components
-Celeborn(Incubating) has three primary components: Master, Worker, and Client.
+Celeborn has three primary components: Master, Worker, and Client.
- Master manages Celeborn cluster and achieves high availability(HA) based on Raft.
- Worker processes read-write requests.
diff --git a/mkdocs.yml b/mkdocs.yml
index 63179b9a2..8dfa08388 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -15,9 +15,9 @@
# limitations under the License.
#
-site_name: Apache Celeborn™ (Incubating)
-repo_name: apache/incubator-celeborn
-repo_url: https://gitbox.apache.org/repos/asf/incubator-celeborn.git
+site_name: Apache Celeborn™
+repo_name: apache/celeborn
+repo_url: https://gitbox.apache.org/repos/asf/celeborn.git
plugins:
- search
@@ -53,21 +53,9 @@ extra:
- icon: fontawesome/brands/github
copyright: >
- <img src="https://www.apachecon.com/event-images/acna2022-wide-dark.png" style="height: 100px;" alt="ApacheCon North America">
- <img src="https://incubator.apache.org/images/incubator_feather_egg_logo_bw_crop.png" style="height: 87px; margin: 0 0 10px 0;">
<br>
- Copyright © 2022 The Apache Software Foundation
+ Copyright © 2024 The Apache Software Foundation, Licensed under the Apache License, Version 2.0.
<a href="https://privacy.apache.org/policies/privacy-policy-public.html">Privacy Policy<a/><br>
- <br>
- Apache Celeborn™, Apache Incubator, Apache, the Apache feather logo, and the Apache Incubator project logo are
- trademarks or registered trademarks of The Apache Software Foundation.<br>
- <br>
- Apache Celeborn™ is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the
- Apache Incubator. Incubation is required of all newly accepted projects until a further review indicates that
- the infrastructure, communications, and decision making process have stabilized in a manner consistent with
- other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or
- stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.<br>
- <br>
Please visit <a href="https://www.apache.org/">Apache Software Foundation</a> for more details.<br>
<br>
diff --git a/project/CelebornBuild.scala b/project/CelebornBuild.scala
index 0a4575b67..80b84b13b 100644
--- a/project/CelebornBuild.scala
+++ b/project/CelebornBuild.scala
@@ -255,8 +255,8 @@ object CelebornCommonSettings {
pomExtra :=
<url>https://celeborn.apache.org/</url>
<scm>
- <url>git@github.com:apache/incubator-celeborn.git</url>
- <connection>scm:git:git@github.com:apache/incubator-celeborn.git</connection>
+ <url>git@github.com:apache/celeborn.git</url>
+ <connection>scm:git:git@github.com:apache/celeborn.git</connection>
</scm>
)
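
A rename of this kind can be double-checked by scanning the tree for stray `incubator`/`incubating` strings. The sketch below is not part of the commit and is only one possible approach: the function name `find_leftovers`, the regular expression, and the allowlist entry are illustrative assumptions. The allowlist is there because the diff above intentionally links to `apache/incubator-gluten`.

```python
import re

# Assumed pattern: any case-insensitive mention of "incubator" or "incubating".
INCUBATOR_PATTERN = re.compile(r"incubator|incubating", re.IGNORECASE)


def find_leftovers(files, allowed=("incubator-gluten",)):
    """Return (path, line_number, line) for every line that still mentions
    the incubator, ignoring substrings listed in `allowed`.

    `files` maps a file path to its text content, so the check can be run
    on in-memory data as well as on files read from disk.
    """
    hits = []
    for path, text in sorted(files.items()):
        for line_no, line in enumerate(text.splitlines(), start=1):
            probe = line
            # Drop references that are expected to keep the old naming.
            for token in allowed:
                probe = probe.replace(token, "")
            if INCUBATOR_PATTERN.search(probe):
                hits.append((path, line_no, line.strip()))
    return hits


if __name__ == "__main__":
    # Hypothetical sample contents modelled on files touched by this diff.
    sample = {
        "NOTICE": "Apache Celeborn (Incubating)\nCopyright 2022-2024",
        "dev/merge_pr.py": 'GITHUB_BASE = "https://github.com/apache/celeborn/pull"',
        "docs/developers/glutensupport.md": "[Gluten](https://github.com/apache/incubator-gluten)",
    }
    for path, line_no, line in find_leftovers(sample):
        print(f"{path}:{line_no}: {line}")
```

To run it against a real checkout, the `files` mapping could be built by reading each tracked file from disk. An empty result would suggest that the transition strings were all updated.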