Canbin Zheng created FLINK-16194:
------------------------------------

             Summary: Refactor the Kubernetes resources construction architecture
                 Key: FLINK-16194
                 URL: https://issues.apache.org/jira/browse/FLINK-16194
             Project: Flink
          Issue Type: Improvement
          Components: Deployment / Kubernetes
    Affects Versions: 1.10.0
            Reporter: Canbin Zheng
             Fix For: 1.11.0


So far, Flink has made efforts for the native integration of Kubernetes. 
However, it is always essential to evaluate the existing design and consider 
alternatives that are better designed and easier to maintain in the long 
run. We have run into some problems while developing new features based on 
the current code. Here are some of them:
 # We don’t have a unified monadic-step based orchestrator architecture to 
construct all the Kubernetes resources.
 ** There are inconsistencies between the orchestrator architecture that the 
client uses to create the Kubernetes resources and the one that the master 
uses to create Pods. This confuses new contributors, who must understand two 
architectural philosophies instead of one, and it makes maintenance and new 
feature development quite challenging.
 ** Pod construction is done in one step. With the introduction of new features 
for the Pod, the construction process could become far more complicated, and 
the functionality of a single class could explode, which hurts code 
readability, writability, and testability. At the moment, we have encountered 
such challenges and realized that it is not an easy thing to develop new 
features related to the Pod.
 ** The implementation of a specific feature is usually scattered across 
multiple decoration classes. For example, the current design uses a decoration 
class chain that contains five Decorator classes to mount a configuration file 
to the Pod. If people would like to add support for other configuration files, 
such as Hadoop configuration or Keytab files, they have no choice but to repeat 
the same tedious and scattered process.
 # We don’t have dedicated objects or tools for centrally parsing, verifying, 
and managing the Kubernetes parameters, which has raised some maintenance and 
inconsistency issues.
 ** There is a lot of duplicated parsing and validation code, including the 
settings of Image, ImagePullPolicy, ClusterID, ConfDir, Labels, etc. This not 
only harms readability and testability but is also prone to mistakes. Refer to 
issue FLINK-16025 for inconsistent parsing of the same parameter.
 ** The parameters are scattered so that some of the method signatures have to 
declare many unnecessary input parameters, such as 
FlinkMasterDeploymentDecorator#createJobManagerContainer.

 

To solve these issues, we propose to:
 # Introduce a unified monadic-step based orchestrator architecture that has a 
cleaner and more consistent abstraction for the Kubernetes resource 
construction process. 
 # Add some dedicated tools for centrally parsing, verifying, and managing the 
Kubernetes parameters.
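
To make the two proposals concrete, here is a minimal sketch of how they could 
fit together. It is not the design from the design doc: every class name below 
(PodParameters, PodSketch, StepDecorator, InitContainerStep, 
MountConfigFileStep, PodFactory) is hypothetical, PodSketch is a simplified 
stand-in for a real Pod model, and the option keys are used for illustration 
only. Each step takes a Pod and returns a new one, and all parsing and 
validation happens in a single parameters object.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical: parses and validates options in one place. */
final class PodParameters {
    private final Map<String, String> conf;

    PodParameters(Map<String, String> conf) {
        this.conf = conf;
    }

    String getClusterId() {
        return required("kubernetes.cluster-id");
    }

    String getImage() {
        return required("kubernetes.container.image");
    }

    // Shared validation, so every caller sees the same trimmed value.
    private String required(String key) {
        String value = conf.get(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException(key + " must be set");
        }
        return value.trim();
    }
}

/** Hypothetical, simplified stand-in for a Pod; not a real Kubernetes model. */
final class PodSketch {
    final String image;
    final Map<String, String> labels;
    final List<String> mountedFiles;

    PodSketch(String image, Map<String, String> labels, List<String> mountedFiles) {
        this.image = image;
        this.labels = Collections.unmodifiableMap(new LinkedHashMap<>(labels));
        this.mountedFiles = Collections.unmodifiableList(new ArrayList<>(mountedFiles));
    }
}

/** One construction step: takes a Pod and returns a new, decorated Pod. */
interface StepDecorator {
    PodSketch decorate(PodSketch pod);
}

/** Hypothetical step that sets the image and a cluster label. */
final class InitContainerStep implements StepDecorator {
    private final PodParameters params;

    InitContainerStep(PodParameters params) {
        this.params = params;
    }

    @Override
    public PodSketch decorate(PodSketch pod) {
        Map<String, String> labels = new LinkedHashMap<>(pod.labels);
        labels.put("app", params.getClusterId());
        return new PodSketch(params.getImage(), labels, pod.mountedFiles);
    }
}

/** Hypothetical step: one feature (mounting a file) lives in one class. */
final class MountConfigFileStep implements StepDecorator {
    private final String fileName;

    MountConfigFileStep(String fileName) {
        this.fileName = fileName;
    }

    @Override
    public PodSketch decorate(PodSketch pod) {
        List<String> files = new ArrayList<>(pod.mountedFiles);
        files.add(fileName);
        return new PodSketch(pod.image, pod.labels, files);
    }
}

/** Applies the steps in order, starting from an empty Pod. */
final class PodFactory {
    static PodSketch build(List<StepDecorator> steps) {
        PodSketch pod = new PodSketch(
                "", Collections.<String, String>emptyMap(), Collections.<String>emptyList());
        for (StepDecorator step : steps) {
            pod = step.decorate(pod);
        }
        return pod;
    }
}
```

Under these assumptions, supporting another configuration file (for example a 
Hadoop configuration) would mean adding one more MountConfigFileStep to the 
list passed to PodFactory.build, rather than touching a chain of decoration 
classes.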

 

Refer to the design doc for the details. Any feedback is welcome.



