mynameborat opened a new pull request #1421: URL: https://github.com/apache/samza/pull/1421
*Issues*
Currently, locality information is part of the job model. The job model is typically immutable and fixed within the lifecycle of an application attempt. Locality information, on the other hand, is dynamic and changes whenever containers move. This difference makes it complicated to program against, model, or define semantics around these models when building features. Removing this dependency has the following benefits:
1. Enables us to move JobModel to public APIs and expose it in JobContext.
2. Enables us to cache and serve the serialized JobModel from the AM servlet to reduce AM overhead (memory, open connections, number of threads) during container startup, especially for jobs with a large number of containers (see https://github.com/apache/samza/pull/1241).
3. Removes tech debt: models should be immutable and should not update themselves.
4. Removes tech debt: makes the current container location a first-class concept for container scheduling and placement, and for tools such as the dashboard, samza-rest, auto-scaling, and diagnostics.

*Changes*
1. Separated locality information out of the job model into `LocalityModel`.
2. Introduced an endpoint in the AM to serve locality information.
3. Added JSON mix-ins for the locality models (LocalityModel & HostLocality).

*Tests*
1. Added tests for the new servlet.
2. Modified existing tests to reflect the refactor.
3. Deployed the new servlet and verified that the locality information is accessible.

*API Changes*:
1. Introduced new models for locality.
2. The previous job model endpoint will no longer serve locality information, i.e. tools relying on it will need to switch to the new endpoint; refer to the usage instructions for details.

*Upgrade Instructions*: None. Refer to the API changes and the usage instructions below to upgrade your tooling if applicable.

*Usage Instructions*: The new locality information is served by the AM endpoint under the `locality` sub-page.
Any tooling will now hit `http://<am-endpoint>/locality` instead of `http://<am-endpoint>`. The endpoint supports two types of queries:
1. Querying for the locality information of the entire job, by hitting `http://<am-endpoint>/locality`. A sample response looks like the following:
```
{
  "host-localities": {
    "0": {
      "id": "0",
      "host": "bkumaras-ld2",
      "jmx-url": "",
      "jmx-tunneling-url": ""
    }
  }
}
```
2. Querying for the locality information of a specific processor, by specifying the `processorId` in the request, e.g. `GET <am-endpoint>/locality?processorId=x`. A sample response looks like the following:
```
{
  "id": "0",
  "host": "mynameborat-host",
  "jmx-url": "",
  "jmx-tunneling-url": ""
}
```
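For tooling that needs to migrate, the two query types described above could be wrapped roughly as follows. This is a minimal sketch and not part of the PR: the helper names (`locality_url`, `fetch_locality`, `host_for_processor`) are hypothetical, the AM endpoint is assumed to be passed in with its scheme and port, and the servlet is assumed to return standard JSON with the field names shown in the sample responses.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen


def locality_url(am_endpoint: str, processor_id: Optional[str] = None) -> str:
    """Build the locality URL for the whole job, or for a single processor."""
    base = am_endpoint.rstrip("/") + "/locality"
    if processor_id is None:
        return base
    # Per the usage instructions, a single processor is selected via `processorId`.
    return base + "?" + urlencode({"processorId": processor_id})


def fetch_locality(am_endpoint: str,
                   processor_id: Optional[str] = None,
                   timeout: float = 10.0) -> dict:
    """GET the locality information and decode it (assumes a JSON response body)."""
    with urlopen(locality_url(am_endpoint, processor_id), timeout=timeout) as resp:
        return json.load(resp)


def host_for_processor(job_locality: dict, processor_id: str) -> Optional[str]:
    """Look up a processor's host in a job-wide response; None if it is absent."""
    entry = job_locality.get("host-localities", {}).get(processor_id)
    return entry["host"] if entry else None
```

A tool that previously read locality from the job model endpoint might instead call `fetch_locality(am_endpoint)` once and resolve hosts with `host_for_processor`, or call `fetch_locality(am_endpoint, "0")` when it only cares about one processor.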
