Canbin Zheng created FLINK-16194:
------------------------------------
Summary: Refactor the Kubernetes resources construction architecture
Key: FLINK-16194
URL: https://issues.apache.org/jira/browse/FLINK-16194
Project: Flink
Issue Type: Improvement
Components: Deployment / Kubernetes
Affects Versions: 1.10.0
Reporter: Canbin Zheng
Fix For: 1.11.0
Flink has put considerable effort into its native Kubernetes integration.
Even so, it is worth re-evaluating the existing design and considering
alternatives that are cleaner and easier to maintain in the long run. We have
run into several problems while developing new features based on the current
code. Here are some of them:
# We don’t have a unified, monadic-step-based orchestrator architecture to
construct all the Kubernetes resources.
** The orchestrator architecture that the client uses to create the Kubernetes
resources is inconsistent with the one that the master uses to create Pods.
For one thing, this confuses new contributors, who carry the cognitive burden
of understanding two architectural philosophies instead of one; for another,
it makes maintenance and new feature development quite challenging.
** Pod construction is done in one step. As new Pod features are introduced,
the construction process could become far more complicated and the
responsibilities of a single class could explode, which hurts code
readability, writability, and testability. We have already run into this
problem and found that developing new Pod-related features is not easy.
** The implementation of a specific feature is usually scattered across
multiple decoration classes. For example, the current design uses a chain of
five Decorator classes to mount a configuration file to the Pod. Anyone who
wants to add support for other configuration files, such as Hadoop
configuration or Keytab files, has no choice but to repeat the same tedious
and scattered process.
# We don’t have dedicated objects or tools for centrally parsing, verifying,
and managing the Kubernetes parameters, which has raised some maintenance and
inconsistency issues.
** There is a lot of duplicated parsing and validation code, including for the
settings of Image, ImagePullPolicy, ClusterID, ConfDir, Labels, etc. This not
only harms readability and testability but is also error-prone. Refer to issue
FLINK-16025 for inconsistent parsing of the same parameter.
** The parameters are scattered so that some of the method signatures have to
declare many unnecessary input parameters, such as
FlinkMasterDeploymentDecorator#createJobManagerContainer.
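To make the second problem concrete, here is a minimal sketch of what a single
place for parsing and validating parameters could look like. It is only an
illustration: the class name KubernetesParametersSketch, the configuration
keys, and the default values are all hypothetical and are not taken from the
Flink code base or from the design doc.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only; not a Flink class.
// Every caller reads parameters through this object, so each setting is
// parsed and validated in exactly one place.
final class KubernetesParametersSketch {
    private final Map<String, String> conf;

    KubernetesParametersSketch(Map<String, String> conf) {
        // Defensive copy so later changes to the caller's map are not visible here.
        this.conf = new HashMap<>(conf);
    }

    // "cluster-id" is a placeholder key for this sketch.
    String getClusterId() {
        String id = conf.get("cluster-id");
        if (id == null || id.trim().isEmpty()) {
            throw new IllegalArgumentException("cluster-id must be set");
        }
        return id.trim();
    }

    // "image" and its default value are placeholders for this sketch.
    String getImage() {
        return conf.getOrDefault("image", "flink:latest");
    }

    // Accepts any casing and normalizes to the three values Kubernetes defines.
    String getImagePullPolicy() {
        String raw = conf.getOrDefault("image-pull-policy", "IfNotPresent");
        switch (raw.trim().toLowerCase(Locale.ROOT)) {
            case "always":
                return "Always";
            case "ifnotpresent":
                return "IfNotPresent";
            case "never":
                return "Never";
            default:
                throw new IllegalArgumentException("Unknown image pull policy: " + raw);
        }
    }
}
```

With an object like this, a method that builds a container could take one
parameters argument instead of a long list of individually parsed values.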
To solve these issues, we propose to:
# Introduce a unified, monadic-step-based orchestrator architecture that
provides a cleaner and more consistent abstraction for the Kubernetes
resources construction process.
# Add some dedicated tools for centrally parsing, verifying, and managing the
Kubernetes parameters.
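As a rough illustration of the first proposal, the sketch below shows one way
a step-based construction could be shaped: each step takes a Pod description
and returns a new one, and a small factory applies the steps in order. All
names here (PodSketch, PodStepSketch, LabelStepSketch,
ConfigFileMountStepSketch, PodFactorySketch) are hypothetical, and PodSketch
is a simplified stand-in for a real Kubernetes Pod model; the actual design
is described in the design doc and may differ.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for a Pod; immutable so steps cannot
// interfere with each other.
final class PodSketch {
    final Map<String, String> labels;
    final List<String> mounts;

    PodSketch(Map<String, String> labels, List<String> mounts) {
        this.labels = Collections.unmodifiableMap(new LinkedHashMap<>(labels));
        this.mounts = Collections.unmodifiableList(new ArrayList<>(mounts));
    }

    static PodSketch empty() {
        return new PodSketch(new LinkedHashMap<String, String>(), new ArrayList<String>());
    }

    PodSketch withLabel(String key, String value) {
        Map<String, String> copy = new LinkedHashMap<>(labels);
        copy.put(key, value);
        return new PodSketch(copy, mounts);
    }

    PodSketch withMount(String path) {
        List<String> copy = new ArrayList<>(mounts);
        copy.add(path);
        return new PodSketch(labels, copy);
    }
}

// One step owns one feature: it receives a Pod and returns a decorated Pod.
interface PodStepSketch {
    PodSketch decorate(PodSketch pod);
}

final class LabelStepSketch implements PodStepSketch {
    private final String key;
    private final String value;

    LabelStepSketch(String key, String value) {
        this.key = key;
        this.value = value;
    }

    public PodSketch decorate(PodSketch pod) {
        return pod.withLabel(key, value);
    }
}

// Mounting another configuration directory means adding another instance of
// this step, not touching several classes.
final class ConfigFileMountStepSketch implements PodStepSketch {
    private final String path;

    ConfigFileMountStepSketch(String path) {
        this.path = path;
    }

    public PodSketch decorate(PodSketch pod) {
        return pod.withMount(path);
    }
}

// The same loop could serve both the client side and the master side.
final class PodFactorySketch {
    static PodSketch build(List<PodStepSketch> steps) {
        PodSketch pod = PodSketch.empty();
        for (PodStepSketch step : steps) {
            pod = step.decorate(pod);
        }
        return pod;
    }
}
```

Under these assumptions, supporting a Hadoop configuration directory would be
a matter of passing one more ConfigFileMountStepSketch to
PodFactorySketch.build, and each step can be unit-tested on its own.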
Refer to the design doc for the details. Any feedback is welcome.