cameronlee314 opened a new pull request #1566: URL: https://github.com/apache/samza/pull/1566
Issues:
In YARN today, the deployment attempt id shared by all containers of a single Samza job can be extracted from the `exec-env-container-id` in the diagnostics information emitted by the job. However, in other execution environments (e.g. Kubernetes), there may not be an `exec-env-container-id` that can be parsed to get a deployment attempt id. It also isn't ideal that the attempt id must be extracted from another field.

Changes:
1. Add an explicit `exec-env-attempt-id` field to `MetricsHeader` and emit this field in the diagnostics messages.
2. Update classes to accept the attempt id as an argument or extract it from the `SAMZA_EPOCH_ID` environment variable.

Testing:
1. Updated unit tests (including compatibility tests for headers with and without the extra field).
2. Deployed a test job in minikube and checked the emitted diagnostics messages for the new attempt id field in the header.

API changes (backwards compatible):
The JSON representation of `MetricsHeader` (used in `MetricsSnapshot`) has an additional optional field called `exec-env-attempt-id`. This field is intended to contain the deployment attempt id, which is shared across all containers (job coordinator and workers) of a Samza job. It is filled in by reading the `SAMZA_EPOCH_ID` environment variable.
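
Illustration (not part of the PR):
The backwards-compatible behavior described above can be sketched from the point of view of a downstream consumer of the diagnostics messages. This is a minimal Python sketch, not the Samza implementation: only `SAMZA_EPOCH_ID` and `exec-env-attempt-id` come from the description, while the helper names and the simplified header layout (including the `job-name` key) are assumptions made for illustration.

```python
import json
import os
from typing import Mapping, Optional

ATTEMPT_ID_ENV = "SAMZA_EPOCH_ID"         # environment variable named in the PR
ATTEMPT_ID_FIELD = "exec-env-attempt-id"  # new optional header field named in the PR


def read_attempt_id(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the attempt id from the environment, or None if unset or empty."""
    return env.get(ATTEMPT_ID_ENV) or None


def build_header(job_name: str, container_id: str,
                 attempt_id: Optional[str]) -> dict:
    """Build a simplified, hypothetical header; omit the attempt id when unknown."""
    header = {"job-name": job_name, "exec-env-container-id": container_id}
    if attempt_id is not None:
        header[ATTEMPT_ID_FIELD] = attempt_id
    return header


def parse_attempt_id(header_json: str) -> Optional[str]:
    """Read the attempt id from a serialized header, tolerating its absence."""
    return json.loads(header_json).get(ATTEMPT_ID_FIELD)
```

Because the field is optional, a header serialized without it still parses, and `parse_attempt_id` simply returns `None` for such a message. That is the property the compatibility tests mentioned under Testing are meant to protect.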
