Chris Riccomini created SAMZA-42:
------------------------------------
Summary: Add a job setup phase to Samza
Key: SAMZA-42
URL: https://issues.apache.org/jira/browse/SAMZA-42
Project: Samza
Issue Type: Bug
Components: container
Affects Versions: 0.6.0
Reporter: Chris Riccomini
We have several use cases for work that must run exactly once at the start of a
Samza job's execution, before any containers start. Examples:
* Validate or create checkpoint topic (if using KafkaCheckpointManager)
* Validate or create state topic (if using LoggedStore)
Right now, this has to happen in the container. On YARN that creates a race
condition, because every container tries to create the same topic.
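
To make the race concrete, here is a minimal sketch of two containers doing
check-then-create against the same topic. FakeTopicAdmin and the topic name are
hypothetical in-memory stand-ins for illustration only, not Kafka or Samza
APIs, and the interleaving is replayed by hand rather than with real threads.

```java
import java.util.HashSet;
import java.util.Set;

public class TopicRaceSketch {
    /** Hypothetical stand-in for a topic admin client; not a Kafka or Samza API. */
    static class FakeTopicAdmin {
        private final Set<String> topics = new HashSet<>();

        boolean exists(String topic) {
            return topics.contains(topic);
        }

        void create(String topic) {
            // Set.add returns false when the element is already present.
            if (!topics.add(topic)) {
                throw new IllegalStateException("topic already exists: " + topic);
            }
        }
    }

    /** Replays the interleaving "A checks, B checks, A creates, B creates". */
    static boolean secondCreateFails(FakeTopicAdmin admin, String topic) {
        boolean aSawTopic = admin.exists(topic); // container A: topic not there yet
        boolean bSawTopic = admin.exists(topic); // container B: not there yet either
        if (!aSawTopic) {
            admin.create(topic); // A wins
        }
        if (!bSawTopic) {
            try {
                admin.create(topic); // B acts on a stale check
            } catch (IllegalStateException e) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        boolean failed = secondCreateFails(new FakeTopicAdmin(), "checkpoint-topic");
        System.out.println("second container's create failed: " + failed);
    }
}
```

Running the creation once, before any container starts, removes the stale
check entirely instead of requiring every container to tolerate the failure.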
Initially, I thought this logic could be put in the YARN AM, but then we'd have
to put corresponding logic in the LocalJobFactory. This gets problematic if we
implement SAMZA-41, since there would no longer be a central place to do a
"before job starts" operation with the LocalJobFactory. If we don't do
SAMZA-41, then we should be fine putting this logic in the YARN AM and
LocalJobFactory.
Alternatively, we could put this logic in JobRunner. One downside is that the
JobRunner would then need access to the full grid it is executing on (not just
the RM), so that it could talk to Kafka or ZooKeeper, for example. I think this
is actually fine, since we always launch our jobs from a spot that has access
to the full grid.