    [ https://issues.apache.org/jira/browse/SAMZA-70?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13898412#comment-13898412 ]

Garry Turkington commented on SAMZA-70:
---------------------------------------

Is this a dupe of SAMZA-42?

> Create setup class to handle per-job startup setup
> --------------------------------------------------
>
>                 Key: SAMZA-70
>                 URL: https://issues.apache.org/jira/browse/SAMZA-70
>             Project: Samza
>          Issue Type: Bug
>          Components: container
>    Affects Versions: 0.6.0
>            Reporter: Chris Riccomini
>
> There is some Samza job setup that must happen before tasks can run. This
> includes setting up the checkpoint and state management (change log)
> factories. For example, we want to verify that the change log and checkpoint
> topics exist and, if not, create them with the proper number of partitions.
>
> We should pull this logic into a SetupJob class, and move the execution into
> a new YarnAppMasterListener called SamzaAppMasterSetup, which should do the
> job setup during the init() call. In addition, we should execute the same
> SetupJob class logic in the ProcessJob.submit and ThreadJob.submit methods.
>
> The motivation for this is threefold:
>
> 1. There is a race condition in the TaskRunner when multiple containers for a
> single job are running in YARN: each TaskRunner tries to create the
> checkpoint/change log topics when they don't exist.
> 2. It makes implementing the TaskRunner logic in other languages easier,
> since non-Java TaskRunner implementations won't have to set the topics up.
> The SetupJob class will be run in the AM (under YARN) or in the Java code of
> the ProcessJob/ThreadJob (for local jobs).
> 3. It gives us a place to run a single chunk of code in a controlled,
> single-threaded way, before any of the TaskRunners start.
>
> Some things to consider: is it OK to hard-code that the SetupJob class
> should always set up just the checkpoint manager and change log topics? Do we
> need to add a setup() method to the lifecycle for everything?
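
To make the proposal above concrete, here is a minimal sketch of what a
SetupJob-style class might look like. It is only an illustration under stated
assumptions: InMemoryTopicAdmin, its methods, and the topic names are
hypothetical stand-ins invented for this example, not Samza or Kafka APIs, and
the real class would talk to the actual checkpoint and change log systems.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SetupJobSketch {

    /** Hypothetical stand-in for a topic admin client; not a Samza or Kafka API. */
    static class InMemoryTopicAdmin {
        private final Map<String, Integer> topics = new LinkedHashMap<>();
        int createCalls = 0;

        boolean exists(String topic) {
            return topics.containsKey(topic);
        }

        void create(String topic, int partitions) {
            createCalls++;
            topics.put(topic, partitions);
        }

        /** Returns the partition count, or null if the topic does not exist. */
        Integer partitions(String topic) {
            return topics.get(topic);
        }
    }

    /**
     * Sketch of the proposed SetupJob: meant to be run once, from a single
     * thread, before any TaskRunner starts (e.g. from the AM's init() or from
     * ProcessJob.submit / ThreadJob.submit).
     */
    static class SetupJob {
        private final InMemoryTopicAdmin admin;
        private final Map<String, Integer> requiredTopics; // topic name -> partitions

        SetupJob(InMemoryTopicAdmin admin, Map<String, Integer> requiredTopics) {
            this.admin = admin;
            this.requiredTopics = requiredTopics;
        }

        /** Creates each required topic only if it is missing, so reruns are harmless. */
        void run() {
            for (Map.Entry<String, Integer> e : requiredTopics.entrySet()) {
                if (!admin.exists(e.getKey())) {
                    admin.create(e.getKey(), e.getValue());
                }
            }
        }
    }

    public static void main(String[] args) {
        InMemoryTopicAdmin admin = new InMemoryTopicAdmin();
        Map<String, Integer> required = new LinkedHashMap<>();
        required.put("example-checkpoint", 1); // hypothetical topic names
        required.put("example-changelog", 4);

        SetupJob setup = new SetupJob(admin, required);
        setup.run();
        setup.run(); // second run finds the topics and creates nothing
        System.out.println("create calls: " + admin.createCalls);
    }
}
```

Because the check-then-create step lives in one place and runs from one
thread, the containers never compete to create the topics, which is the race
described in point 1.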



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
