This is an automated email from the ASF dual-hosted git repository.

ethanfeng pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/incubator-celeborn.git


The following commit(s) were added to refs/heads/main by this push:
     new cb9adfc51 [CELEBORN-974] Add quick start guide about using MapReduce with Celeborn
cb9adfc51 is described below

commit cb9adfc5114fcd3d692423586eb68c16d4123dba
Author: mingji <[email protected]>
AuthorDate: Thu Sep 14 19:31:01 2023 +0800

    [CELEBORN-974] Add quick start guide about using MapReduce with Celeborn
    
    ### What changes were proposed in this pull request?
    Add quick start guide about using MapReduce with Celeborn.
    
    ### Why are the changes needed?
    Celeborn recently added MapReduce client support.
    
    ### Does this PR introduce _any_ user-facing change?
    NO.
    
    ### How was this patch tested?
    No need to test.
    
    Closes #1908 from FMX/CELEBORN-974.
    
    Authored-by: mingji <[email protected]>
    Signed-off-by: mingji <[email protected]>
---
 docs/README.md | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 58 insertions(+)

diff --git a/docs/README.md b/docs/README.md
index e49b89f25..cbd3b8e57 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -148,4 +148,62 @@ And the following message in Celeborn Worker's log:
 INFO [dispatcher-event-loop-4] Controller: Reserved 1 primary location and 0 replica location for local-1690000152711-0
 INFO [dispatcher-event-loop-3] Controller: Start commitFiles for local-1690000152711-0
 INFO [async-reply] Controller: CommitFiles for local-1690000152711-0 success with 1 committed primary partitions, 0 empty primary partitions, 0 failed primary partitions, 0 committed replica partitions, 0 empty replica partitions, 0 failed replica partitions.
+```
+
+## Start MapReduce With Celeborn
+### Add Celeborn client jar to MapReduce's classpath
+1. Add `$CELEBORN_HOME/mr/*.jar` to `mapreduce.application.classpath` and `yarn.application.classpath`.
+2. Restart your YARN cluster.
+### Add Celeborn configurations to MapReduce's conf
+Modify `${HADOOP_CONF_DIR}/yarn-site.xml`
+```xml
+<configuration>
+    <property>
+        <name>yarn.app.mapreduce.am.job.recovery.enable</name>
+        <value>false</value>
+    </property>
+
+    <property>
+        <name>yarn.app.mapreduce.am.command-opts</name>
+        <!-- Append 'org.apache.celeborn.mapreduce.v2.app.MRAppMasterWithCeleborn' to this setting -->
+        <value>org.apache.celeborn.mapreduce.v2.app.MRAppMasterWithCeleborn</value>
+    </property>
+</configuration>
+```
+Modify `${HADOOP_CONF_DIR}/mapred-site.xml`
+```xml
+<configuration>
+    <property>
+        <name>mapreduce.job.reduce.slowstart.completedmaps</name>
+        <value>1</value>
+    </property>
+    <property>
+        <name>mapreduce.celeborn.master.endpoints</name>
+        <!-- Replace placeholder with the real master address -->
+        <value>placeholder</value>
+    </property>
+    <property>
+        <name>mapreduce.job.map.output.collector.class</name>
+        <value>org.apache.hadoop.mapred.CelebornMapOutputCollector</value>
+    </property>
+    <property>
+        <name>mapreduce.job.reduce.shuffle.consumer.plugin.class</name>
+        <value>org.apache.hadoop.mapreduce.task.reduce.CelebornShuffleConsumer</value>
+    </property>
+</configuration>
+```
+Then you can run a word count to check whether your configs are correct.
+```shell
+cd $HADOOP_HOME
+hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.1.jar wordcount /sometext /someoutput
+```
+During the MapReduce job, you should see the following message in the Celeborn Master's log:
+```log
+Master: Offer slots successfully for 1 reducers of application_1694674023293_0003-0 on 1 workers.
+```
+And the following message in Celeborn Worker's log:
+```log
+INFO [dispatcher-event-loop-4] Controller: Reserved 1 primary location and 0 replica location for application_1694674023293_0003-0
+INFO [dispatcher-event-loop-3] Controller: Start commitFiles for application_1694674023293_0003-0
+INFO [async-reply] Controller: CommitFiles for application_1694674023293_0003-0 success with 1 committed primary partitions, 0 empty primary partitions, 0 failed primary partitions, 0 committed replica partitions, 0 empty replica partitions, 0 failed replica partitions.
 ```
\ No newline at end of file
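
The guide added by this commit suggests running a word count to check the configuration. As a complementary, purely illustrative sketch (not part of the commit or of Celeborn), the `mapred-site.xml` settings listed in the diff could also be checked statically before submitting a job. The helper name `check_mapred_site` and the `EXPECTED` table are hypothetical; the property names and values are copied from the snippet in the diff, and the only assumption about `mapreduce.celeborn.master.endpoints` is that it must no longer be the literal `placeholder`.

```python
import xml.etree.ElementTree as ET

# Expected values copied from the mapred-site.xml snippet in the diff.
# None means "any value other than the literal 'placeholder' is accepted".
EXPECTED = {
    "mapreduce.job.reduce.slowstart.completedmaps": "1",
    "mapreduce.celeborn.master.endpoints": None,
    "mapreduce.job.map.output.collector.class":
        "org.apache.hadoop.mapred.CelebornMapOutputCollector",
    "mapreduce.job.reduce.shuffle.consumer.plugin.class":
        "org.apache.hadoop.mapreduce.task.reduce.CelebornShuffleConsumer",
}


def check_mapred_site(xml_text):
    """Hypothetical helper: return a list of problems, empty if none found."""
    root = ET.fromstring(xml_text)

    # Collect every <property> as a name -> value mapping.
    actual = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name:
            actual[name] = value

    problems = []
    for name, expected in EXPECTED.items():
        value = actual.get(name)
        if value is None:
            problems.append(f"missing: {name}")
        elif expected is None:
            # Only check that the placeholder was replaced with something.
            if value in ("", "placeholder"):
                problems.append(f"unset placeholder: {name}")
        elif value != expected:
            problems.append(f"unexpected value for {name}: {value!r}")
    return problems
```

To use it, read `${HADOOP_CONF_DIR}/mapred-site.xml` in binary mode and pass the contents to `check_mapred_site`; an empty list means the four properties match the guide. This only checks the file contents, so the word count run is still needed to confirm that the cluster actually picks the settings up.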
