Chris Riccomini created SAMZA-42:
------------------------------------

             Summary: Add a job setup phase to Samza
                 Key: SAMZA-42
                 URL: https://issues.apache.org/jira/browse/SAMZA-42
             Project: Samza
          Issue Type: Bug
          Components: container
    Affects Versions: 0.6.0
            Reporter: Chris Riccomini


We have several use cases for doing things once at the beginning of a Samza 
job's execution (before containers start). Examples:

* Validate or create checkpoint topic (if using KafkaCheckpointManager)
* Validate or create state topic (if using LoggedStore)

Right now, we have to do this in the container, which creates a race condition 
when running on YARN: every container checks for the topic and then tries to 
create the same topic.
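
To make the race concrete, here is a minimal sketch of the check-then-create 
pattern when several containers start at once. FakeBroker, TopicRace, and 
createAttempts are hypothetical names invented for this illustration; they are 
an in-memory stand-in, not Kafka or Samza APIs. The simulation assumes the 
worst-case interleaving, where every container finishes its existence check 
before any container creates the topic.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical in-memory stand-in for a broker; not a real Kafka API.
final class FakeBroker {
    private final Set<String> topics = new HashSet<>();
    int createCalls = 0;

    boolean exists(String topic) {
        return topics.contains(topic);
    }

    void create(String topic) {
        createCalls++;          // count every attempt, including redundant ones
        topics.add(topic);
    }
}

final class TopicRace {
    // Simulates the worst-case interleaving: all containers run the
    // existence check first, and only afterwards act on what they saw.
    static int createAttempts(int containers, String topic) {
        FakeBroker broker = new FakeBroker();
        boolean[] sawMissing = new boolean[containers];
        for (int i = 0; i < containers; i++) {
            sawMissing[i] = !broker.exists(topic);
        }
        for (int i = 0; i < containers; i++) {
            if (sawMissing[i]) {
                broker.create(topic);
            }
        }
        return broker.createCalls;
    }
}
```

With three containers, the sketch records three create attempts for one topic, 
which is the duplication a single pre-start setup step would avoid.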

Initially, I thought this logic could be put in the YARN AM, but then we'd have 
to put corresponding logic in the LocalJobFactory. This gets problematic if we 
implement SAMZA-41, since there would no longer be a central place to do a 
"before job starts" operation with the LocalJobFactory. If we don't do 
SAMZA-41, then we should be fine putting this logic in the YARN AM and 
LocalJobFactory.

Alternatively, we could put this logic in JobRunner. One downside is that the 
JobRunner would need full access to the grid it is trying to execute on (not 
just the RM) so that it could talk to Kafka/ZooKeeper, for example. I think 
this is actually fine, since we always execute our jobs from a machine that has 
access to the full grid.
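
As a rough sketch of what a setup phase driven from the runner might look 
like, the code below runs every setup step exactly once and only then starts 
containers. JobSetupStep, SetupThenLaunch, and runJob are hypothetical names 
for illustration only; they are not existing Samza interfaces, and the real 
JobRunner would submit the job to YARN or run it locally rather than append to 
a log.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hook for one "before job starts" action, for example
// validating or creating the checkpoint topic.
interface JobSetupStep {
    void run(List<String> log);
}

final class SetupThenLaunch {
    // Runs all setup steps once, in order, before any container starts.
    // The returned log records the order of events.
    static List<String> runJob(List<JobSetupStep> steps, int containerCount) {
        List<String> log = new ArrayList<>();
        for (JobSetupStep step : steps) {
            step.run(log);
        }
        for (int i = 0; i < containerCount; i++) {
            log.add("start container " + i);
        }
        return log;
    }
}
```

Because the steps run in one place before launch, containers never need to 
create topics themselves, whichever job factory ends up starting them.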

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
