Repository: incubator-samza
Updated Branches:
  refs/heads/master 048ffd2fe -> c932c5029


SAMZA-181; add a tutorial to show how to run samza jobs from HDFS.


Project: http://git-wip-us.apache.org/repos/asf/incubator-samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-samza/commit/c932c502
Tree: http://git-wip-us.apache.org/repos/asf/incubator-samza/tree/c932c502
Diff: http://git-wip-us.apache.org/repos/asf/incubator-samza/diff/c932c502

Branch: refs/heads/master
Commit: c932c50298bdd8cd6022286d364aa6daaaaa5667
Parents: 048ffd2
Author: Yan Fang <[email protected]>
Authored: Thu Mar 13 12:48:35 2014 -0700
Committer: Chris Riccomini <[email protected]>
Committed: Thu Mar 13 12:48:35 2014 -0700

----------------------------------------------------------------------
 .../0.7.0/deploy-samza-job-from-hdfs.md         | 55 ++++++++++++++++++++
 docs/learn/tutorials/0.7.0/index.md             |  2 +
 2 files changed, 57 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-samza/blob/c932c502/docs/learn/tutorials/0.7.0/deploy-samza-job-from-hdfs.md
----------------------------------------------------------------------
diff --git a/docs/learn/tutorials/0.7.0/deploy-samza-job-from-hdfs.md b/docs/learn/tutorials/0.7.0/deploy-samza-job-from-hdfs.md
new file mode 100644
index 0000000..ca5f599
--- /dev/null
+++ b/docs/learn/tutorials/0.7.0/deploy-samza-job-from-hdfs.md
@@ -0,0 +1,55 @@
+---
+layout: page
+title: Deploying a Samza job from HDFS
+---
+
+This tutorial uses [hello-samza](../../../startup/hello-samza/0.7.0/) to illustrate how to run a Samza job whose .tar.gz package is published to HDFS.
+
+### Build a new Samza job package
+
+Rebuild the Samza job package so that it includes hadoop-hdfs-version.jar.
+
+* Add the following dependency to pom.xml of samza-job-package.
+
+```
+<dependency>
+  <groupId>org.apache.hadoop</groupId>
+  <artifactId>hadoop-hdfs</artifactId>
+  <version>2.2.0</version>
+</dependency>
+```
+
+* Add the following code to src/main/assembly/src.xml in samza-job-package.
+
+```
+<include>org.apache.hadoop:hadoop-hdfs</include>
+```
+
+* Create .tar.gz package
+
+```
+mvn clean package
+```
+
+* Make sure hadoop-common-version.jar has the same version as your hadoop-hdfs-version.jar. Otherwise, you may still get errors.
+
+### Upload the package
+
+```
+hadoop fs -put ./samza-job-package/target/samza-job-package-0.7.0-dist.tar.gz /path/to/tgz
+```
+
+### Add HDFS configuration
+
+Put your cluster's hdfs-site.xml file into the ~/.samza/conf directory (the same place as yarn-site.xml).
+
+### Change properties file
+
+Set yarn.package.path in the job's properties file to the HDFS location of the uploaded package.
+
+```
+yarn.package.path=hdfs://<hdfs name node ip>:<hdfs name node port>/path/to/tgz
+```
+
+Then you should be able to run the Samza job as described in [hello-samza](../../../startup/hello-samza/0.7.0/).
+   
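
The tutorial above asks the reader to confirm that the hadoop-common and hadoop-hdfs jars in the package share one version. The following is a minimal sketch, not part of the committed tutorial, of how that check could be scripted before uploading. The function names are hypothetical, and it assumes the jars inside the archive are named `<artifact>-<version>.jar`; any classifier suffix is treated as part of the version.

```python
import re
import tarfile
from typing import Dict, Iterable, Set

# Matches e.g. "lib/hadoop-hdfs-2.2.0.jar"; captures the artifact name and version.
# Assumption: jars follow the usual Maven "<artifact>-<version>.jar" naming.
_HADOOP_JAR = re.compile(r"(?:^|/)(hadoop-(?:common|hdfs))-(\d[\w.\-]*)\.jar$")


def hadoop_jar_versions(names: Iterable[str]) -> Dict[str, Set[str]]:
    """Collect the versions of hadoop-common and hadoop-hdfs jars among file names."""
    versions: Dict[str, Set[str]] = {"hadoop-common": set(), "hadoop-hdfs": set()}
    for name in names:
        match = _HADOOP_JAR.search(name)
        if match:
            versions[match.group(1)].add(match.group(2))
    return versions


def versions_match(names: Iterable[str]) -> bool:
    """True only if both jars are present, each in exactly one, identical version."""
    found = hadoop_jar_versions(names)
    common, hdfs = found["hadoop-common"], found["hadoop-hdfs"]
    return len(common) == 1 and common == hdfs


def check_package(tgz_path: str) -> bool:
    """Run the version check against the members of a .tar.gz job package."""
    with tarfile.open(tgz_path, "r:gz") as archive:
        return versions_match(archive.getnames())
```

Under these assumptions, calling `check_package("./samza-job-package/target/samza-job-package-0.7.0-dist.tar.gz")` after `mvn clean package` would return `False` when either jar is missing or the two versions differ.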

http://git-wip-us.apache.org/repos/asf/incubator-samza/blob/c932c502/docs/learn/tutorials/0.7.0/index.md
----------------------------------------------------------------------
diff --git a/docs/learn/tutorials/0.7.0/index.md b/docs/learn/tutorials/0.7.0/index.md
index 42283ef..e40861e 100644
--- a/docs/learn/tutorials/0.7.0/index.md
+++ b/docs/learn/tutorials/0.7.0/index.md
@@ -5,6 +5,8 @@ title: Tutorials
 
 [Remote Debugging with Samza](remote-debugging-samza.html)
 
+[Deploying a Samza Job from HDFS](deploy-samza-job-from-hdfs.html)
+
 <!-- TODO a bunch of tutorials
 [Log Walkthrough](log-walkthrough.html)
 <a href="configuring-kafka-system.html">Configuring a Kafka System</a><br/>
