[
https://issues.apache.org/jira/browse/SAMZA-438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris Riccomini updated SAMZA-438:
----------------------------------
Attachment: SAMZA-438-0.patch
Changes made:
# Removed environment variable compression (SAMZA-337), since config and task
mappings are now passed via HTTP rather than in environment variables.
# Converted YARN's WebAppServer to a generic HttpServer in samza-core.
# Wrote JobServlet, which serves a job's config, task:SSP mapping, and
task:changelog partition mapping via HTTP JSON.
# Updated ShellCommandBuilder/CommandBuilder to set an HTTP URL environment
variable, rather than passing the config, task:SSP mapping, and task:changelog
partition mapping as environment variables.
# Updated ProcessJob, ThreadJob, and the AM code to run the
HttpServer/JobServlet, and set the HTTP URL environment variable when starting
SamzaContainers.
# Updated SamzaContainer to fetch config, task:SSP mapping, and task:changelog
partition mapping using the HTTP URL environment variable.
# Changed container name to container ID in SamzaContainerContext and
run-*.sh. Kept the legacy "samza.container.name" system property, so we stay
backwards compatible with log4j.properties files that refer to it (hello-samza).
# Wrote a JsonHelpers class to help with all the terrible contortions we have
to go through to make Scala work with Jackson.
# Updated Util to remove the compress methods, and add a helper "read" method
for reading from HTTP URLs.
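
To illustrate the container side of this change, here is a minimal sketch of
reading a payload from a URL supplied through an environment variable. It is
not the code in the patch: the class name HttpReadSketch, the environment
variable name SAMZA_COORDINATOR_URL, and the timeout parameter are all
assumptions made for the example, and only JDK classes are used.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class HttpReadSketch {
    // Hypothetical environment variable name; the patch defines the real one.
    static final String ENV_URL = "SAMZA_COORDINATOR_URL";

    /** Reads the full response body of an HTTP URL as a UTF-8 string. */
    static String read(URL url, int timeoutMs) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setConnectTimeout(timeoutMs);
        conn.setReadTimeout(timeoutMs);
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            // Copy the stream until the server closes it.
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // The parent job is assumed to have set this variable before launch.
        String url = System.getenv(ENV_URL);
        if (url == null) {
            System.err.println(ENV_URL + " is not set");
            return;
        }
        // The returned JSON would then be parsed into config and mappings.
        System.out.println(read(new URL(url), 30000));
    }
}
```

The container only needs the URL (plus its ID) at startup, so the size of the
config no longer matters for the environment, which is why the compression
from SAMZA-337 can go away.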
Remaining items in this ticket:
# Write tests.
Tickets I'd like to open as follow-ons:
# Remove TaskNamesToSystemStreamPartitions, and convert to a proper
job/container/task data model. Make the data model Java based, and use Jackson
annotations.
# Convert the HttpServer to be a proper Jetty/Jersey/Jackson server that uses
the properly defined data model to serve its content.
> Pass config via HTTP
> --------------------
>
> Key: SAMZA-438
> URL: https://issues.apache.org/jira/browse/SAMZA-438
> Project: Samza
> Issue Type: Sub-task
> Reporter: Chris Riccomini
> Assignee: Chris Riccomini
> Attachments: SAMZA-438-0.patch
>
>
> SAMZA-348 has a detailed design proposal on how we can configure Samza via a
> stream. Part of this work involves converting the SamzaContainer to retrieve
> its information via an HTTP/JSON request, rather than from JSON-encoded
> environment variables.
> The three items that we'll need to serve via a "job coordinator" will be the
> job's config, the job's container:task:SSP mappings, and the job's
> task:changelog partition mappings.
> We'll also need to assign each container a unique ID, so that it can retrieve
> its task:SSP mappings (and thus, its task:changelog partition mappings).
> This ticket does not encompass creating a "job coordinator". Instead, we'll
> just start the HTTP server in three places: the ThreadJob, the ProcessJob,
> and the YARN AM. In all three cases, a URL will be set via an environment
> variable, which the SamzaContainer will receive, and use to retrieve its
> information.
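
The flow described in the issue (start an HTTP server next to the job, then
hand each container a URL and an ID through its environment) could be sketched
roughly as follows. This is an illustration under stated assumptions, not the
patch itself: it uses the JDK's built-in com.sun.net.httpserver.HttpServer
instead of the HttpServer/JobServlet in samza-core, and the class name
JobHttpSketch and the variable names SAMZA_COORDINATOR_URL and
SAMZA_CONTAINER_ID are hypothetical.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

class JobHttpSketch {
    // Hypothetical environment variable names, chosen for this example only.
    static final String ENV_URL = "SAMZA_COORDINATOR_URL";
    static final String ENV_CONTAINER_ID = "SAMZA_CONTAINER_ID";

    /**
     * Starts a server on an ephemeral local port that answers every request
     * with the given (non-empty) JSON body.
     */
    static HttpServer start(String jsonBody) throws IOException {
        byte[] body = jsonBody.getBytes(StandardCharsets.UTF_8);
        HttpServer server =
            HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            exchange.getResponseHeaders().set("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    /** Builds the environment a launcher would pass to one container. */
    static Map<String, String> containerEnv(HttpServer server, int containerId) {
        Map<String, String> env = new HashMap<>();
        int port = server.getAddress().getPort();
        env.put(ENV_URL, "http://127.0.0.1:" + port + "/");
        env.put(ENV_CONTAINER_ID, Integer.toString(containerId));
        return env;
    }
}
```

A ThreadJob, ProcessJob, or AM equivalent would call something like start(...)
once with the serialized job information, then call containerEnv(...) for each
container it launches. The JSON layout itself is left open here, since the
ticket does not spell it out.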
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)