mynameborat opened a new pull request #1421:
URL: https://github.com/apache/samza/pull/1421


   *Issues*
   Currently locality information is part of the job model. The job model is 
typically immutable and fixed within the lifecycle of an application attempt. 
Locality information, on the other hand, is dynamic and changes whenever 
containers move. This difference makes it complicated to program against, 
model, or define semantics around these models when building features. 
Removing this dependency has the following benefits:
   
   1. Enables us to move JobModel to public APIs and expose it in JobContext
   2. Enables us to cache and serve serialized JobModel from the AM servlet to 
reduce AM overhead (memory, open connections, num threads) during container 
startup, esp. for jobs with a large number of containers (See: 
https://github.com/apache/samza/pull/1241)
   3. Removes tech debt: models should be immutable, and should not update 
themselves.
   4. Removes tech debt: makes current container location a first-class concept 
for container scheduling / placement, and for tools like dashboard, 
samza-rest, auto-scaling, diagnostics etc.
   
   *Changes*
   
   1. Separated locality information out of the job model into `LocalityModel`
   2. Introduced an endpoint in AM to serve locality information
   3. Added Json MixIns for locality models (LocalityModel & HostLocality) 
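
   To illustrate the immutability goal behind the new models, here is a minimal 
sketch of what such a pair of classes could look like. It is not the code in 
this PR: the class names `HostLocalitySketch` and `LocalityModelSketch` are 
hypothetical, and the fields are only inferred from the sample responses in the 
usage instructions (`id`, `host`, `jmx-url`, `jmx-tunneling-url`, 
`host-localities`).

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a per-processor locality entry.
// Field names mirror the sample JSON responses, not the actual Samza classes.
final class HostLocalitySketch {
    private final String id;
    private final String host;
    private final String jmxUrl;
    private final String jmxTunnelingUrl;

    HostLocalitySketch(String id, String host, String jmxUrl, String jmxTunnelingUrl) {
        this.id = id;
        this.host = host;
        this.jmxUrl = jmxUrl;
        this.jmxTunnelingUrl = jmxTunnelingUrl;
    }

    String getId() { return id; }
    String getHost() { return host; }
    String getJmxUrl() { return jmxUrl; }
    String getJmxTunnelingUrl() { return jmxTunnelingUrl; }
}

// Hypothetical stand-in for the job-wide locality model: processor id -> locality.
final class LocalityModelSketch {
    private final Map<String, HostLocalitySketch> hostLocalities;

    LocalityModelSketch(Map<String, HostLocalitySketch> hostLocalities) {
        // Defensive copy plus read-only view, so the model never changes after construction.
        this.hostLocalities = Collections.unmodifiableMap(new LinkedHashMap<>(hostLocalities));
    }

    Map<String, HostLocalitySketch> getHostLocalities() { return hostLocalities; }

    // Returns null when no locality is known for the given processor id.
    HostLocalitySketch getHostLocality(String id) { return hostLocalities.get(id); }
}
```

   With this shape, a container movement is represented by building a new model 
instance rather than mutating an existing one, which is the property the 
tech-debt items above ask for.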
   
   *Tests*
   
   1. Added tests for new servlet
   2. Modified existing tests to reflect the refactor
   3. Deployed the new servlet and verified the locality information is 
accessible
   
   *API Changes*: 
   
   1. Introduced new models for locality. 
   2. The previous job model endpoint no longer serves locality information, 
i.e. tools that rely on it must switch to the new endpoint; refer to the usage 
instructions for details.
   
   *Upgrade Instructions*: None. Refer to the API changes & the usage 
instructions below to upgrade your tooling if applicable.
   *Usage Instructions*: Locality information is now served by the AM under the 
`locality` sub-page. Tooling that needs locality information should query 
`http://<am-endpoint>/locality` instead of `http://<am-endpoint>`.
   The endpoint supports two types of queries:
   
   1. Querying for the locality information of the entire job. This is done by 
sending a request to `http://<am-endpoint>/locality`. A sample response looks 
like the following:
   
   ```
   {
     "host-localities": {
       "0": {
         "id": "0",
         "host": "bkumaras-ld2",
         "jmx-url": "",
         "jmx-tunneling-url": ""
       }
     }
   }
   ```
   
   2. Querying for the locality information of a specific processor. This is 
done by specifying the `processorId` in the request, e.g. `GET 
<am-endpoint>/locality?processorId=x`. A sample response looks like the 
following:
   ```
   {
     "id": "0",
     "host": "mynameborat-host",
     "jmx-url": "",
     "jmx-tunneling-url": ""
   }
   ```
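
   For tooling that needs to migrate, the two query types above can be sketched 
as a small Java 11+ client. This is an illustration only: the class 
`LocalityQuerySketch` and its methods are hypothetical, the AM address must be 
supplied by the caller, and the response body is returned as a raw string 
rather than parsed.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for tools that read locality from the AM endpoint.
final class LocalityQuerySketch {

    // Builds the locality URL. A null processorId requests locality for the whole job;
    // otherwise the processorId query parameter selects a single processor.
    static URI localityUri(String amEndpoint, String processorId) {
        String base = amEndpoint.endsWith("/")
                ? amEndpoint.substring(0, amEndpoint.length() - 1)
                : amEndpoint;
        String url = base + "/locality";
        if (processorId != null) {
            url += "?processorId=" + URLEncoder.encode(processorId, StandardCharsets.UTF_8);
        }
        return URI.create(url);
    }

    // Issues a GET and returns the raw JSON body; non-200 responses are treated as errors
    // (an assumption, since the PR description does not list error codes).
    static String fetch(URI uri) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected status " + response.statusCode() + " from " + uri);
        }
        return response.body();
    }
}
```

   A tool would call `fetch(localityUri(amEndpoint, null))` for the whole job 
and `fetch(localityUri(amEndpoint, "0"))` for processor `0`, then parse the 
JSON with whatever library it already uses.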



