This is an automated email from the ASF dual-hosted git repository.

zhouky pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/incubator-celeborn.git


The following commit(s) were added to refs/heads/main by this push:
     new fb2af146b [CELEBORN-822][DOC] Add quick start guide
fb2af146b is described below

commit fb2af146bfc78ef8b07b3db664abb75988f58f51
Author: zky.zhoukeyong <[email protected]>
AuthorDate: Sat Jul 22 21:39:41 2023 +0800

    [CELEBORN-822][DOC] Add quick start guide
    
    ### What changes were proposed in this pull request?
    As title.
    
![image](https://github.com/apache/incubator-celeborn/assets/948245/e2e96131-26be-497f-9f11-e8b5e215a15d)
    
    ### Why are the changes needed?
    As title.
    
    ### Does this PR introduce _any_ user-facing change?
    No.
    
    ### How was this patch tested?
    No.
    
    Closes #1745 from waitinfuture/822.
    
    Lead-authored-by: zky.zhoukeyong <[email protected]>
    Co-authored-by: Keyong Zhou <[email protected]>
    Signed-off-by: zky.zhoukeyong <[email protected]>
---
 docs/README.md | 97 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 94 insertions(+), 3 deletions(-)

diff --git a/docs/README.md b/docs/README.md
index b1493b79e..4e7810017 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -1,7 +1,6 @@
 ---
 hide:
   - navigation
-  - toc
 
 license: |
   Licensed to the Apache Software Foundation (ASF) under one or more
@@ -17,6 +16,98 @@ license: |
   See the License for the specific language governing permissions and
   limitations under the License.
 ---
+Quick Start
+===
+This guide shows how to run Apache Spark with Apache Celeborn (Incubating).
 
-Apache Celeborn (Incubating)
-===
\ No newline at end of file
+## Download Celeborn
+Download the latest Celeborn binary from the [Download page](https://celeborn.apache.org/download/).
+Decompress the binary and set `CELEBORN_HOME`:
+```shell
+tar -C <DST_DIR> -zxvf apache-celeborn-<VERSION>-bin.tgz
+export CELEBORN_HOME=<DECOMPRESSED_PATH>
+```
+
+## Configure Logging and Storage
+#### Configure Logging
+```shell
+cd $CELEBORN_HOME/conf
+cp log4j2.xml.template log4j2.xml
+```
+#### Configure Storage
+Configure the directory to store shuffle data, for example `$CELEBORN_HOME/shuffle`:
+```shell
+cd $CELEBORN_HOME/conf
+echo "celeborn.worker.storage.dirs=$CELEBORN_HOME/shuffle" > celeborn-defaults.conf
+```
+
+## Start Celeborn Service
+#### Start Master
+```shell
+cd $CELEBORN_HOME
+./sbin/start-master.sh
+```
+You should see the Master's IP and port in its log:
+```shell
+INFO [main] NettyRpcEnvFactory: Starting RPC Server [MasterSys] on 192.168.2.109:9097
+```
+#### Start Worker
+Use the Master's IP and port to start the Worker:
+```shell
+cd $CELEBORN_HOME
+./sbin/start-worker.sh celeborn://<MASTER_IP>:<MASTER_PORT>
+```
+You should see the following messages in the Worker's log:
+```shell
+23/07/22 11:39:23,546 INFO [main] MasterClient: connect to master 192.168.2.109:9097.
+23/07/22 11:39:23,673 INFO [main] Worker: Register worker successfully.
+```
+You should also see the following message in the Master's log:
+```shell
+23/07/22 11:39:23,650 INFO [dispatcher-event-loop-9] Master: Registered worker
+Host: 192.168.2.109
+RpcPort: 57806
+PushPort: 57807
+FetchPort: 57809
+ReplicatePort: 57808
+SlotsUsed: 0
+LastHeartbeat: 0
+HeartbeatElapsedSeconds: xxx
+Disks:
+  DiskInfo0: xxx
+UserResourceConsumption: empty
+WorkerRef: null
+```
+
+## Start Spark with Celeborn
+#### Copy Celeborn Client to Spark's jars
+The Celeborn release binary contains clients for Spark 2.x and Spark 3.x. Copy the corresponding client jar into Spark's
+`jars/` directory:
+```shell
+cp $CELEBORN_HOME/spark/<Celeborn Client Jar> $SPARK_HOME/jars/
+```
+#### Start spark-shell
+Set `spark.shuffle.manager` to Celeborn's ShuffleManager, and turn off `spark.shuffle.service.enabled`:
+```shell
+cd $SPARK_HOME
+
+./bin/spark-shell \
+--conf spark.shuffle.manager=org.apache.spark.shuffle.celeborn.SparkShuffleManager \
+--conf spark.shuffle.service.enabled=false
+```
+Then run the following test case:
+```scala
+spark.sparkContext.parallelize(1 to 10, 10)
+  .flatMap( _ => (1 to 100).iterator
+  .map(num => num)).repartition(10).count
+```
+During the Spark job, you should see the following message in the Celeborn Master's log:
+```shell
+Master: Offer slots successfully for 10 reducers of local-1690000152711-0 on 1 workers.
+```
+And the following messages in the Celeborn Worker's log:
+```shell
+23/07/22 12:29:57,952 INFO [dispatcher-event-loop-9] Controller: Reserved 10 primary location and 0 replica location for local-1690000152711-0
+23/07/22 12:29:58,117 INFO [dispatcher-event-loop-10] Controller: Start commitFiles for local-1690000152711-0
+23/07/22 12:29:58,153 INFO [async-reply] Controller: CommitFiles for local-1690000152711-0 success with 10 committed primary partitions, 0 empty primary partitions, 0 failed primary partitions, 0 committed replica partitions, 0 empty replica partitions, 0 failed replica partitions.
+```
\ No newline at end of file
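
Note on the "Configure Storage" step in the patch above: the `echo ... >` redirect replaces the whole `celeborn-defaults.conf`, so any keys already in that file are lost. A minimal sketch of a variant that only updates the one key might look like the following. The function name `set_storage_dirs` and its two arguments are hypothetical and not part of Celeborn; the `key=value` line format is taken from the patch.

```shell
# Hypothetical helper, not shipped with Celeborn.
# Sets celeborn.worker.storage.dirs in a conf file and keeps all other lines.
#   $1: path to the conf file (for example $CELEBORN_HOME/conf/celeborn-defaults.conf)
#   $2: directory that should hold shuffle data
set_storage_dirs() {
  conf_file=$1
  shuffle_dir=$2
  # Make sure the file exists so grep has something to read.
  touch "$conf_file"
  # Copy every line except a previous value of the key.
  # grep exits non-zero when nothing is left to print, which is fine here.
  grep -v '^celeborn\.worker\.storage\.dirs=' "$conf_file" > "$conf_file.tmp" || true
  # Append the new value in the same key=value form the guide uses.
  echo "celeborn.worker.storage.dirs=$shuffle_dir" >> "$conf_file.tmp"
  mv "$conf_file.tmp" "$conf_file"
}

# Usage (paths are placeholders):
# set_storage_dirs "$CELEBORN_HOME/conf/celeborn-defaults.conf" "$CELEBORN_HOME/shuffle"
```

Running the helper twice with different directories leaves a single `celeborn.worker.storage.dirs` line holding the most recent value.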

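Note on the "Start Worker" step in the patch above: the Worker endpoint is built from the host and port that the Master prints at startup. As a rough sketch, assuming the log line has exactly the format shown in the guide, the `celeborn://` URL could be derived from that line with `sed`. The function name `master_endpoint` is hypothetical, and the location of the Master's log file is not stated in the guide, so the line is read from standard input here.

```shell
# Hypothetical helper, not shipped with Celeborn.
# Reads Master log lines on stdin and prints celeborn://<host>:<port>
# for the first "Starting RPC Server [MasterSys] on host:port" line.
master_endpoint() {
  sed -n 's|.*Starting RPC Server \[MasterSys\] on \([^ ]*:[0-9][0-9]*\).*|celeborn://\1|p' \
    | head -n 1
}

# Example with the log line quoted in the guide:
line='INFO [main] NettyRpcEnvFactory: Starting RPC Server [MasterSys] on 192.168.2.109:9097'
printf '%s\n' "$line" | master_endpoint
# prints celeborn://192.168.2.109:9097
```

The printed value can then be passed as the argument to `./sbin/start-worker.sh`.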