cameronlee314 opened a new pull request #1532: URL: https://github.com/apache/samza/pull/1532
Issues: `StreamAppender` is coupled with Samza being run in YARN: 1. `StreamAppender` specifically checks for `isApplicationMaster`, which is a YARN-only term, and it has different initialization logic if it is or isn't in the application master. 2. `StreamAppender` is coupled with `JobModelManager`, which is only used in YARN. 3. `StreamAppender` can only get configs from a specific HTTP coordinator endpoint, so it can't be used in other setups. For running in Kubernetes (or other environments in general), this class can be generalized better. Changes: 1. Add a `LoggingContextHolder` which provides general access to config for log appenders regardless of container type. This is still a static class, similar to how appenders were already using the static `JobModelManager.currentJobModelManager`. `StreamAppender` gets config from this `LoggingContextHolder`; it no longer reads the config from `JobModelManager` or an HTTP coordinator endpoint. `JobModelManager` and `ContainerLaunchUtil` were updated to set the config in the `LoggingContextHolder`. 2. Update `StreamAppender` (log4j, log4j2) to have a consistent flow for checking when it can initialize itself, instead of checking application master vs. worker. 3. Refactoring included a bug fix in which the first log event after system initialization was not sent to the log stream. 4. `StreamAppender.getPartitionCount` only returns the overridden partition count value instead of falling back to "job.container.count". This was done to prevent access to the config before it was available, and this is more consistent with the log4j getter pattern for configured values. Tests: 1. Updated unit tests 2. Ran integration tests API/usage changes: 1. Constructor for `org.apache.samza.logging.log4j2.StreamAppender` has an extra argument for `LoggingContextHolder`. Any classes which extend this need to be updated to pass the extra argument. Creation of `org.apache.samza.logging.log4j2.StreamAppender` in `log4j2.xml` does not change. 2. `StreamAppender.getPartitionCount` (for both log4j and log4j2) does not fall back to reading "job.container.count". Any classes which use that method and rely on "job.container.count" as a fallback need to be updated to read that config on their own. 3. In worker containers, logs that happen before the call to `LoggingContextHolder.INSTANCE.setConfig` in `ContainerLaunchUtil` will no longer be handled by `StreamAppender`. There are few logs that are impacted by this. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
