Repository: tez
Updated Branches:
  refs/heads/branch-0.5.0 7e766d296 -> 65eb31893


TEZ-1464. Pull INSTALL.txt fix and update for 0.5.0


Project: http://git-wip-us.apache.org/repos/asf/tez/repo
Commit: http://git-wip-us.apache.org/repos/asf/tez/commit/65eb3189
Tree: http://git-wip-us.apache.org/repos/asf/tez/tree/65eb3189
Diff: http://git-wip-us.apache.org/repos/asf/tez/diff/65eb3189

Branch: refs/heads/branch-0.5.0
Commit: 65eb31893c20e16036d187a9d0d8043b844a22e1
Parents: 7e766d2
Author: Bikas Saha <[email protected]>
Authored: Tue Aug 19 18:28:07 2014 -0700
Committer: Bikas Saha <[email protected]>
Committed: Tue Aug 19 18:28:07 2014 -0700

----------------------------------------------------------------------
 INSTALL.txt | 54 ++++++++++++++++++++++++++++++------------------------
 1 file changed, 30 insertions(+), 24 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/tez/blob/65eb3189/INSTALL.txt
----------------------------------------------------------------------
diff --git a/INSTALL.txt b/INSTALL.txt
index cd29fce..17e7413 100644
--- a/INSTALL.txt
+++ b/INSTALL.txt
@@ -2,7 +2,7 @@ How to use TEZ
 =======================
 
 Tez provides an ApplicationMaster that can run any arbritary DAG of tasks. It also
-provides a translation layer to run MR or MRR jobs using the MR APIs. This translation
+provides a translation layer to run MR jobs using the MR APIs. This translation
 layer is not fully feature compatible so if you do see any issues with running your
 existing MR jobs on TEZ, please file jiras.
 
@@ -12,35 +12,28 @@ Install/Deploy Instructions
 1) Deploy Apache Hadoop either using the 2.2.0 release or a compatible 2.x version.
 2) Build tez using "mvn clean package -DskipTests=true -Dmaven.javadoc.skip=true"
    - If you prefer to run the unit tests, remove skipTests from the command above.
-   - A tarball containing the libraries required to run tez will be created at tez-dist/target/tez-0.5.0-SNAPSHOT.tar.gz
+   - A tarball containing the libraries required to run tez will be created at tez-dist/target/tez-0.5.0.tar.gz
 3) Copy the relevant tez tarball into HDFS, and configure tez-site.xml
-   - A tez tarball containing tez and hadoop libraries will be found at tez-dist/target/tez-0.5.0-SNAPSHOT.tar.gz
+   - A tez tarball containing tez and hadoop libraries will be found at tez-dist/target/tez-0.5.0.tar.gz
    - Assuming that the tez jars are put in /apps/ on HDFS, the command would be
-     "hadoop dfs -mkdir /apps/tez-0.5.0-SNAPSHOT"
-     "hadoop dfs -copyFromLocal tez-dist/target/tez-0.5.0-SNAPSHOT-archive.tar.gz /apps/tez-0.5.0-SNAPSHOT/"
+     "hadoop dfs -mkdir /apps/tez-0.5.0"
+     "hadoop dfs -copyFromLocal tez-dist/target/tez-0.5.0.tar.gz /apps/tez-0.5.0/"
    - tez-site.xml configuration 
      - Set tez.lib.uris to point to the tar.gz uploaded to HDFS. Assuming the steps mentioned so far were followed,
-       set tez.lib.uris to "${fs.default.name}/apps/tez-0.5.0-SNAPSHOT/tez-0.5.0-SNAPSHOT.tar.gz"
+       set tez.lib.uris to "${fs.defaultFS}/apps/tez-0.5.0/tez-0.5.0.tar.gz"
      - Ensure tez.use.cluster.hadoop-libs is not set in tez-site.xml, or if it is set, the value should be false
-4) Optional: If running existing MapReduce jobs on Tez. Modify mapred-site.xml to change "mapreduce.framework.name" property from its
-   default value of "yarn" to "yarn-tez"
+4) Optional: If running existing MapReduce jobs on Tez. Modify mapred-site.xml to change 
+"mapreduce.framework.name" property from its default value of "yarn" to "yarn-tez"
 5) Configure the client node to include the tez-libraries in the hadoop classpath
    - Extract the tez tarball created in step 2 to a local directory - (assuming TEZ_JARS is where the files will be decompressed for the next steps)
-     "tar -xvzf tez-dist/target/tez-0.5.0-SNAPSHOT.tar.gz -C $TEZ_JARS"
+     "tar -xvzf tez-dist/target/tez-0.5.0.tar.gz -C $TEZ_JARS"
    - set HADOOP_CLASSPATH to include the tez-libraries
      - set TEZ_CONF_DIR to the location of tez-site.xml
      - The command to set up the classpath should be something like:
        "export HADOOP_CLASSPATH=${TEZ_CONF_DIR}:${TEZ_JARS}/*:${TEZ_JARS}/lib/*"
      - Please note the "*" which is an important requirement when setting up classpaths for directories containing jar files.
 
-6) Submit a MR job as you normally would using something like:
-
-$HADOOP_PREFIX/bin/hadoop jar hadoop-mapreduce-client-jobclient-2.2.0-tests.jar sleep -mt 1 -rt 1 -m 1 -r 1
-
-This will use the TEZ DAG ApplicationMaster to run the MR job. This can be
-verified by looking at the AM's logs from the YARN ResourceManager UI.
-
-7) There is a basic example of using an MRR job in the tez-examples.jar. Refer to OrderedWordCount.java
+6) There is a basic example of a Tez job in the tez-examples.jar. Refer to OrderedWordCount.java
 in the source code. To run this example:
 
 $HADOOP_PREFIX/bin/hadoop jar tez-examples.jar orderedwordcount <input> <output>
@@ -61,14 +54,27 @@ set -DUSE_TEZ_SESSION=true
 
 $HADOOP_PREFIX/bin/hadoop jar tez-tests.jar testorderedwordcount -DUSE_TEZ_SESSION=true <input1> <output1> <input2> <output2>
 
+7) To test MR jobs you can submit an MR job as you normally would using something like:
 
-Alternate machanism to setup Tez to use Hadoop libraries from the cluster.
-Step 3 changes as follows. Also subsequent steps would use tez-dist/target/tez-0.5.0-SNAPSHOT-minimal.tar.gz instead of tez-dist/target/tez-0.5.0-SNAPSHOT.tar.gz
-   - A tez build without Hadoop dependencies will be available at tez-dist/target/tez-0.5.0-SNAPSHOT-minimal.tar.gz
+$HADOOP_PREFIX/bin/hadoop jar hadoop-mapreduce-client-jobclient-2.2.0-tests.jar sleep -mt 1 -rt 1 -m 1 -r 1
+
+This will use the TEZ DAG ApplicationMaster to run the MR job. This can be verified by looking at 
+the AM's logs from the YARN ResourceManager UI. This needs mapred-site.xml to have "mapreduce.framework.name" 
+set to "yarn-tez"
+
+
+Hadoop Installation dependent Install/Deploy Instructions
+=========================================================
+The above install instructions use Tez with pre-packaged Hadoop libraries included in the package and is the 
+recommended method for installation. If its needed to make Tez use the existing cluster Hadoop libraries then
+follow this alternate machanism to setup Tez to use Hadoop libraries from the cluster.
+Step 3 above changes as follows. Also subsequent steps would use tez-dist/target/tez-0.5.0-minimal.tar.gz instead of tez-dist/target/tez-0.5.0.tar.gz
+   - A tez build without Hadoop dependencies will be available at tez-dist/target/tez-0.5.0-minimal.tar.gz
    - Assuming that the tez jars are put in /apps/ on HDFS, the command would be
-     "hadoop dfs -mkdir /apps/tez-0.5.0-SNAPSHOT"
-     "hadoop dfs -copyFromLocal tez-dist/target/tez-0.5.0-SNAPSHOT-archive-minimal.tar.gz /apps/tez-0.5.0-SNAPSHOT"
+     "hadoop dfs -mkdir /apps/tez-0.5.0"
+     "hadoop dfs -copyFromLocal tez-dist/target/tez-0.5.0-minimal.tar.gz /apps/tez-0.5.0"
    - tez-site.xml configuration
      - Set tez.lib.uris to point to the paths in HDFS containing the tez jars. Assuming the steps mentioned so far were followed,
-     set tez.lib.uris to "${fs.default.name}/apps/tez-0.5.0-SNAPSHOT/tez-0.5.0-SNAPSHOT-minimal.tar.gz
-     - Also set tez.use.cluster.hadoop-libs to true
+     set tez.lib.uris to "${fs.defaultFS}/apps/tez-0.5.0/tez-0.5.0-minimal.tar.gz
+     - set tez.use.cluster.hadoop-libs to true
+
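
Note (not part of the commit): the tez-site.xml settings described in step 3 of the
updated INSTALL.txt might look roughly like the fragment below. This is only a sketch
in the standard Hadoop configuration file layout; it assumes the tarball was uploaded
to /apps/tez-0.5.0/ as in the instructions, and the exact file contents are not taken
from the commit.

```xml
<!-- Sketch of tez-site.xml for the recommended (full tarball) install. -->
<!-- Assumes the HDFS path used in step 3 of INSTALL.txt. -->
<configuration>
  <property>
    <name>tez.lib.uris</name>
    <value>${fs.defaultFS}/apps/tez-0.5.0/tez-0.5.0.tar.gz</value>
  </property>
  <!-- Leave tez.use.cluster.hadoop-libs unset, or set it to false, for this mode. -->
  <property>
    <name>tez.use.cluster.hadoop-libs</name>
    <value>false</value>
  </property>
</configuration>
```

For the Hadoop-installation-dependent variant, the same fragment would instead point
tez.lib.uris at tez-0.5.0-minimal.tar.gz and set tez.use.cluster.hadoop-libs to true.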
