This is an automated email from the ASF dual-hosted git repository.
ethanfeng pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/incubator-celeborn.git
The following commit(s) were added to refs/heads/main by this push:
new cb9adfc51 [CELEBORN-974] Add quick start guide about using MapReduce
with Celeborn
cb9adfc51 is described below
commit cb9adfc5114fcd3d692423586eb68c16d4123dba
Author: mingji <[email protected]>
AuthorDate: Thu Sep 14 19:31:01 2023 +0800
[CELEBORN-974] Add quick start guide about using MapReduce with Celeborn
### What changes were proposed in this pull request?
Add quick start guide about using MapReduce with Celeborn.
### Why are the changes needed?
Celeborn recently added support for a MapReduce client.
### Does this PR introduce _any_ user-facing change?
NO.
### How was this patch tested?
No need to test.
Closes #1908 from FMX/CELEBORN-974.
Authored-by: mingji <[email protected]>
Signed-off-by: mingji <[email protected]>
---
docs/README.md | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 58 insertions(+)
diff --git a/docs/README.md b/docs/README.md
index e49b89f25..cbd3b8e57 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -148,4 +148,62 @@ And the following message in Celeborn Worker's log:
INFO [dispatcher-event-loop-4] Controller: Reserved 1 primary location and 0
replica location for local-1690000152711-0
INFO [dispatcher-event-loop-3] Controller: Start commitFiles for
local-1690000152711-0
INFO [async-reply] Controller: CommitFiles for local-1690000152711-0 success
with 1 committed primary partitions, 0 empty primary partitions, 0 failed
primary partitions, 0 committed replica partitions, 0 empty replica partitions,
0 failed replica partitions.
+```
+
+## Start MapReduce With Celeborn
+### Add Celeborn client jar to MapReduce's classpath
+1. Add `$CELEBORN_HOME/mr/*.jar` to `mapreduce.application.classpath` and `yarn.application.classpath`.
+2. Restart your YARN cluster.
+### Add Celeborn configurations to MapReduce's conf
+Modify `${HADOOP_CONF_DIR}/yarn-site.xml`
+```xml
+<configuration>
+ <property>
+ <name>yarn.app.mapreduce.am.job.recovery.enable</name>
+ <value>false</value>
+ </property>
+
+ <property>
+ <name>yarn.app.mapreduce.am.command-opts</name>
+ <!-- Append 'org.apache.celeborn.mapreduce.v2.app.MRAppMasterWithCeleborn' to this setting -->
+ <value>org.apache.celeborn.mapreduce.v2.app.MRAppMasterWithCeleborn</value>
+ </property>
+</configuration>
+```
+Modify `${HADOOP_CONF_DIR}/mapred-site.xml`
+```xml
+<configuration>
+ <property>
+ <name>mapreduce.job.reduce.slowstart.completedmaps</name>
+ <value>1</value>
+ </property>
+ <property>
+ <name>mapreduce.celeborn.master.endpoints</name>
+ <!-- Replace placeholder with the real master address -->
+ <value>placeholder</value>
+ </property>
+ <property>
+ <name>mapreduce.job.map.output.collector.class</name>
+ <value>org.apache.hadoop.mapred.CelebornMapOutputCollector</value>
+ </property>
+ <property>
+ <name>mapreduce.job.reduce.shuffle.consumer.plugin.class</name>
+ <value>org.apache.hadoop.mapreduce.task.reduce.CelebornShuffleConsumer</value>
+ </property>
+</configuration>
+```
+Then you can run a word count to check whether your configs are correct.
+```shell
+cd $HADOOP_HOME
+hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.1.jar wordcount /sometext /someoutput
+```
+During the MapReduce job, you should see the following message in Celeborn Master's log:
+```log
+Master: Offer slots successfully for 1 reducers of application_1694674023293_0003-0 on 1 workers.
+```
+And the following message in Celeborn Worker's log:
+```log
+INFO [dispatcher-event-loop-4] Controller: Reserved 1 primary location and 0 replica location for application_1694674023293_0003-0
+INFO [dispatcher-event-loop-3] Controller: Start commitFiles for application_1694674023293_0003-0
+INFO [async-reply] Controller: CommitFiles for application_1694674023293_0003-0 success with 1 committed primary partitions, 0 empty primary partitions, 0 failed primary partitions, 0 committed replica partitions, 0 empty replica partitions, 0 failed replica partitions.
+```
\ No newline at end of file
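
Before running the word count job, it may help to confirm that `mapred-site.xml` actually contains the four properties added by this change and that the `placeholder` master address has been replaced. The following is a minimal sketch, not part of Celeborn or Hadoop: the function name `check_celeborn_mr_conf` and its output messages are hypothetical, and it assumes each property is written as a literal `<name>...</name>` element, as in the quick start.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with Celeborn or Hadoop): reports which
# of the property names from the quick start are missing from a
# mapred-site.xml file. Returns 0 if everything looks set, 1 otherwise.
check_celeborn_mr_conf() {
  local conf_file="$1"
  local problems=0
  local name
  for name in \
    mapreduce.job.reduce.slowstart.completedmaps \
    mapreduce.celeborn.master.endpoints \
    mapreduce.job.map.output.collector.class \
    mapreduce.job.reduce.shuffle.consumer.plugin.class
  do
    # -F: match a fixed string, -q: print nothing, only set the exit status.
    if ! grep -Fq "<name>${name}</name>" "$conf_file"; then
      echo "missing: ${name}"
      problems=1
    fi
  done
  # The quick start leaves the master address as the literal word
  # "placeholder"; flag it if it has not been replaced yet.
  if grep -Fq "<value>placeholder</value>" "$conf_file"; then
    echo "unreplaced: mapreduce.celeborn.master.endpoints"
    problems=1
  fi
  return "$problems"
}
```

A possible invocation is `check_celeborn_mr_conf "${HADOOP_CONF_DIR}/mapred-site.xml"`. The check is purely textual, so it does not validate the XML or confirm that the master endpoint is reachable.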