[
https://issues.apache.org/jira/browse/SAMZA-375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14149910#comment-14149910
]
Chris Riccomini commented on SAMZA-375:
---------------------------------------
Expanding on Jon's comments. The lifecycle of a Samza job in YARN looks like
this:
# User runs run-job.sh with a config.
# run-job.sh negotiates with the YARN RM for a place to start Samza's YARN AM.
# The YARN RM (and NM) start the job's AM on a random host in the cluster.
# The AM looks at the job's config, and figures out how many containers it
needs, what size, etc. to run the job in question.
# The AM requests containers from the YARN RM according to what's needed based
on the job's config.
# The AM heartbeats to the RM every N seconds, and gets containers assigned to
it as time passes (similar to offers in Mesos).
# When the AM receives a new container, it tells YARN to start the container
(on a remote host) by running the bash command run-container.sh. It also sets
some environment variables when it's talking to YARN about starting the
container, which YARN forwards to the SamzaContainer that's started. These
variables tell the SamzaContainer which partitions (Kafka input partitions) it
should read from, and also the job's configuration. Using these two things, the
SamzaContainer starts up and runs.
# When a SamzaContainer fails, YARN's RM notifies the AM (during the AM's next
heartbeat) that a container is dead. The AM then requests a new container, and
when it arrives, assigns the dead container's partitions to it (via environment
variables, again), and starts it up.
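To make the AM side of this concrete, here is a rough sketch of the
request/heartbeat/reassign loop as a toy simulation. It is not Samza or YARN
code: ToyResourceManager, ToyAppMaster, the "container.count" config key, and
the PARTITION_IDS / JOB_CONFIG environment variable names are all hypothetical
stand-ins chosen for illustration, and the toy RM grants at most one container
per heartbeat just to mimic containers arriving over time.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ToyResourceManager:
    """Hypothetical stand-in for the YARN RM (not a real YARN API)."""
    pending_requests: int = 0
    next_id: int = 0
    dead: list = field(default_factory=list)  # ids the RM will report as dead

    def request_containers(self, count):
        self.pending_requests += count

    def heartbeat(self):
        """Return (newly allocated container ids, dead container ids)."""
        allocated = []
        if self.pending_requests > 0:  # grant at most one container per beat
            allocated.append(f"container_{self.next_id}")
            self.next_id += 1
            self.pending_requests -= 1
        dead, self.dead = self.dead, []
        return allocated, dead


class ToyAppMaster:
    """Hypothetical AM: requests containers, assigns partitions via env vars."""

    def __init__(self, rm, config, partitions):
        self.rm = rm
        self.config = config
        count = int(config["container.count"])  # assumed config key
        # Split the input partitions round-robin into one group per container.
        self.unassigned = [partitions[i::count] for i in range(count)]
        self.running = {}  # container id -> environment passed at launch
        rm.request_containers(count)

    def on_heartbeat(self):
        allocated, dead = self.rm.heartbeat()
        for cid in dead:
            # Recover the dead container's partitions and ask for a replacement.
            env = self.running.pop(cid)
            self.unassigned.append(json.loads(env["PARTITION_IDS"]))
            self.rm.request_containers(1)
        for cid in allocated:
            if not self.unassigned:
                continue  # nothing left to run; a real AM would release it
            group = self.unassigned.pop(0)
            # "Launching" here just records the env the container would get.
            self.running[cid] = {
                "PARTITION_IDS": json.dumps(group),
                "JOB_CONFIG": json.dumps(self.config),
            }


rm = ToyResourceManager()
am = ToyAppMaster(rm, {"container.count": "2"}, [0, 1, 2, 3])
am.on_heartbeat()
am.on_heartbeat()            # both containers are now running
rm.dead.append("container_0")
am.on_heartbeat()            # AM learns of the failure, requests a new one
am.on_heartbeat()            # replacement arrives and inherits [0, 2]
print(am.running["container_2"]["PARTITION_IDS"])  # prints [0, 2]
```

The point of the sketch is that the AM holds the partition-to-container
mapping, so a replacement container only needs the right environment to pick
up where the failed one was assigned.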
> Investigate Mesos Job Support
> -----------------------------
>
> Key: SAMZA-375
> URL: https://issues.apache.org/jira/browse/SAMZA-375
> Project: Samza
> Issue Type: Bug
> Components: hello-samza
> Reporter: Jon Bringhurst
> Assignee: Jon Bringhurst
> Labels: mesos, project
> Attachments: Screen Shot 2014-08-23 at 5.51.39 PM.png, Screen Shot
> 2014-09-22 at 8.59.12 AM.png
>
>
> It would be nice if Samza had support for Mesos (https://mesos.apache.org/).
> The current plan is to create a MesosJob and MesosJobFactory, then look into
> what it would take to allow the AM code to act as a Mesos scheduler.
> The feasibility of this landing in trunk will be better understood after a
> rough prototype has been created.