Hi all,

While working on TEZ-3991 <https://issues.apache.org/jira/browse/TEZ-3991>, I'm starting to see feature variability that creates a lot of tangled and scattered code. Features like session-mode and local-mode spread themselves through the code as a lot of hard-to-manage "if" conditions.

I think it was OK with just these modes, but if we introduce more modes (client-unmanaged AM, non-YARN AM, non-YARN task containers, ZooKeeper AM discovery-mode, Kubernetes mode, etc.), this starts to get out of control. It's the kind of problem that Aspect-Oriented Programming and Dependency Injection were trying to solve (except I find those solutions too heavyweight). In C++ we might use something like preprocessor conditions. Plugins don't help completely, because we end up with families of plugins/configurations that are required to be used together.

Based on a suggestion from Hitesh, I'd like to try to modularize these variabilities using java.util.ServiceLoader. Each "mode" would have its own Service, which would be used as a factory for the other classes that provide variability (as a family). A simplified example: the session-mode Service would create one class to handle variability in TezClient and another class to handle variability in the AM.

This is basically like extension points in Eclipse. We could add extension points as needed, and unless a Service explicitly provides a behavior for an extension point, that extension point will act as a no-op (so older Services don't need to change when new extension points are added).

I'm thinking of trying this out with an implementation for SessionMode and LocalMode, as a baseline, before introducing it for other new modes.

What do people think? Any feedback on the idea would be appreciated.
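
To make the idea a bit more concrete, here is a rough sketch of what a per-mode Service with no-op extension points could look like. None of these names exist in Tez today: ModeService, ClientExtension, AmExtension, SessionModeService and ModeServices are all hypothetical, made up for illustration. The only real API used is java.util.ServiceLoader. A real provider would also need to be a public class listed in a META-INF/services file (or declared in module-info.java), which a single-file sketch can't show.

```java
import java.util.ServiceLoader;

// Hypothetical extension point for client-side variability (e.g. in TezClient).
// The default method is the no-op behavior.
interface ClientExtension {
    default String describeClient() { return "default-client"; }
}

// Hypothetical extension point for AM-side variability.
interface AmExtension {
    default String describeAm() { return "default-am"; }
}

// Hypothetical per-mode service: a factory for a family of extensions.
// New extension points can be added later as default methods, so older
// services keep compiling and simply inherit the no-op.
interface ModeService {
    String modeName();
    default ClientExtension clientExtension() { return new ClientExtension() {}; }
    default AmExtension amExtension() { return new AmExtension() {}; }
}

// Hypothetical session-mode service. It customizes the client side only;
// the AM side falls back to the no-op default.
final class SessionModeService implements ModeService {
    @Override
    public String modeName() { return "session"; }

    @Override
    public ClientExtension clientExtension() {
        return new ClientExtension() {
            @Override
            public String describeClient() { return "session-client"; }
        };
    }
}

final class ModeServices {
    private ModeServices() {}

    // Looks up a registered ModeService by name via ServiceLoader. If none
    // is registered under that name, returns a service whose extension
    // points are all no-ops.
    static ModeService findOrDefault(String name) {
        for (ModeService service : ServiceLoader.load(ModeService.class)) {
            if (service.modeName().equals(name)) {
                return service;
            }
        }
        return new ModeService() {
            @Override
            public String modeName() { return name; }
        };
    }
}

final class ModeDemo {
    public static void main(String[] args) {
        // Instantiated directly here; in practice it would come from
        // ModeServices.findOrDefault("session") once it is registered.
        ModeService session = new SessionModeService();
        System.out.println(session.clientExtension().describeClient()); // session-client
        System.out.println(session.amExtension().describeAm());         // default-am
    }
}
```

The point of the default methods is the backwards-compatibility property mentioned above: adding a new extension point to ModeService would not force any change to SessionModeService.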
