cameronlee314 opened a new pull request #1566:
URL: https://github.com/apache/samza/pull/1566


   Issues: In YARN today, the deployment attempt id, which is shared by all 
containers of a single Samza job, can be extracted from the 
exec-env-container-id in the diagnostics information emitted by a job. However, 
in other execution environments (e.g. Kubernetes), there may not be an 
exec-env-container-id that can be parsed to get a deployment attempt id. Having 
to derive the attempt id from another field is also fragile.
   
   Changes:
   1. Add an explicit exec-env-attempt-id field to `MetricsHeader` and emit 
this field in the diagnostics messages.
   2. Update classes to accept attempt id as an argument or extract it from the 
`SAMZA_EPOCH_ID` environment variable.
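
   As a rough illustration of change 2, the lookup could work as sketched 
below. This is not the code in this PR: the class and method names 
(`AttemptIdSketch`, `resolveAttemptId`) are hypothetical, and the precedence 
of an explicit argument over the environment variable is an assumption. Only 
the `SAMZA_EPOCH_ID` variable name comes from the description above.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch; not a class from the Samza code base.
class AttemptIdSketch {
  // Environment variable named in the PR description.
  static final String EPOCH_ID_ENV = "SAMZA_EPOCH_ID";

  // Assumption: an explicitly passed attempt id wins; otherwise fall back to
  // the environment. Empty strings are treated as "not set".
  static Optional<String> resolveAttemptId(String explicitAttemptId, Map<String, String> env) {
    if (explicitAttemptId != null && !explicitAttemptId.isEmpty()) {
      return Optional.of(explicitAttemptId);
    }
    return Optional.ofNullable(env.get(EPOCH_ID_ENV)).filter(v -> !v.isEmpty());
  }

  public static void main(String[] args) {
    // Reads the real process environment; prints Optional.empty when unset.
    System.out.println(resolveAttemptId(null, System.getenv()));
  }
}
```

   Passing the environment in as a map, rather than calling `System.getenv()` 
inside the helper, keeps the lookup easy to unit test.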
   
   Testing:
   1. Updated unit tests (including compatibility tests for messages with and 
without the extra field)
   2. Deployed a test job in minikube and checked the emitted diagnostics 
messages for the new attempt id field in the header
   
   API changes (backwards compatible):
   JSON representation of `MetricsHeader` (used in `MetricsSnapshot`) has an 
additional optional field called `exec-env-attempt-id`, and this field is 
intended to contain the deployment attempt id which is shared across all 
containers (job coordinator and workers) of a Samza job. This field is filled 
in by reading the `SAMZA_EPOCH_ID` environment variable.
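
   To illustrate why an optional field is backwards compatible for 
consumers, here is a minimal sketch that treats an already-parsed header as a 
plain string map. The class `HeaderCompatSketch` and the map representation 
are assumptions made for illustration, not Samza's actual deserialization 
path, and the values are placeholders. Only the field names 
`exec-env-container-id` and `exec-env-attempt-id` come from the description 
above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch; not a class from the Samza code base.
class HeaderCompatSketch {
  static final String ATTEMPT_ID_FIELD = "exec-env-attempt-id";

  // A header without the field (older producer) yields Optional.empty
  // instead of failing, so older and newer messages can be read alike.
  static Optional<String> attemptId(Map<String, String> header) {
    return Optional.ofNullable(header.get(ATTEMPT_ID_FIELD));
  }

  public static void main(String[] args) {
    // Header as an older job might emit it: no attempt id field.
    Map<String, String> oldHeader = new LinkedHashMap<>();
    oldHeader.put("exec-env-container-id", "container-id-placeholder");

    // Header with the new optional field present.
    Map<String, String> newHeader = new LinkedHashMap<>(oldHeader);
    newHeader.put(ATTEMPT_ID_FIELD, "attempt-id-placeholder");

    System.out.println(attemptId(oldHeader).orElse("<absent>"));
    System.out.println(attemptId(newHeader).orElse("<absent>"));
  }
}
```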


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

