[ https://issues.apache.org/jira/browse/SAMZA-307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yan Fang updated SAMZA-307:
---------------------------

    Description: 
Currently, we have two ways of deploying a Samza job to a YARN cluster: from 
[HDFS|https://samza.incubator.apache.org/learn/tutorials/0.7.0/deploy-samza-job-from-hdfs.html]
 and from 
[HTTP|https://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-node-yarn.html].
Neither of them works out of the box. Users have to go through the tutorial, add 
dependencies, recompile, put the job package on HDFS or an HTTP server, and only 
then run the job.

I feel this is a little cumbersome. We may be able to provide a simpler 
way to deploy the job.

1. When users have YARN and HDFS in the same cluster (such as CDH5), we can 
provide a job-submit script that takes the cluster configuration, then calls some 
Java code to upload the assembly (all the jars Samza needs, already compiled) 
along with the user's job jar (which changes frequently) to HDFS, and then runs 
the job as usual. (Yes, I learnt it from [Spark's YARN 
deploy|http://spark.apache.org/docs/latest/running-on-yarn.html])

2.  
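
For item 1, here is a rough sketch of what the job-submit script could do, written as a dry run that only builds and prints the commands. Everything specific in it is an assumption for illustration: the class name DeployPlan, the HDFS directory layout, and the file names are made up, and how the job would locate the separately uploaded assembly and job jar is left open. The only existing pieces are the standard Hadoop shell commands (`hdfs dfs -mkdir -p`, `hdfs dfs -put -f`) and `run-job.sh` with its `--config-factory` and `--config-path` options.

```java
import java.util.ArrayList;
import java.util.List;

/** Dry-run sketch: lists the commands a hypothetical job-submit script might run. */
final class DeployPlan {

    /**
     * Builds the command list for one deployment.
     *
     * @param hdfsDir    target HDFS directory (hypothetical layout)
     * @param assembly   local path of the precompiled Samza assembly
     * @param jobJar     local path of the user's job jar
     * @param configPath absolute local path of the job's properties file
     */
    static List<String> commands(String hdfsDir, String assembly,
                                 String jobJar, String configPath) {
        List<String> cmds = new ArrayList<>();
        // Create the target directory if it does not exist yet.
        cmds.add("hdfs dfs -mkdir -p " + hdfsDir);
        // Upload the assembly; -f overwrites an existing copy.
        cmds.add("hdfs dfs -put -f " + assembly + " " + hdfsDir + "/");
        // Upload the user's job jar, which changes more often.
        cmds.add("hdfs dfs -put -f " + jobJar + " " + hdfsDir + "/");
        // Run the job as usual through run-job.sh.
        cmds.add("bin/run-job.sh"
            + " --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory"
            + " --config-path=file://" + configPath);
        return cmds;
    }

    public static void main(String[] args) {
        // All argument values below are placeholders.
        List<String> cmds = commands(
            "hdfs://namenode:8020/samza/my-job",
            "samza-assembly.tar.gz",
            "my-job.jar",
            "/opt/my-job/config/my-job.properties");
        for (String cmd : cmds) {
            System.out.println(cmd);
        }
    }
}
```

Keeping the assembly and the job jar as two separate uploads is the point of the proposal: the assembly rarely changes, so a real script could skip that step when the file is already on HDFS.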
 

 

> Simplify deploy procedure 
> --------------------------
>
>                 Key: SAMZA-307
>                 URL: https://issues.apache.org/jira/browse/SAMZA-307
>             Project: Samza
>          Issue Type: Improvement
>            Reporter: Yan Fang



--
This message was sent by Atlassian JIRA
(v6.2#6252)
