zhengcanbin opened a new pull request #11233: [FLINK-16194][k8s] Refactor the 
Kubernetes decorator design
URL: https://github.com/apache/flink/pull/11233
 
 
   ## What is the purpose of the change
   
   So far, Flink has made efforts toward native integration with Kubernetes. 
However, it is always essential to evaluate the existing design and consider 
alternatives that are better designed and easier to maintain in the long 
run. We have run into some problems while developing new features based on 
the current code. Here are some of them:
   
   1. We don’t have a unified monadic-step based orchestrator architecture to 
construct all the Kubernetes resources.
   2. We don’t have dedicated objects or tools for centrally parsing, 
verifying, and managing the Kubernetes parameters, which has raised some 
maintenance and inconsistency issues. 
   
   The ultimate goal of this PR is to evolve some of the designs. Here is a 
summary of the main evolution.
   
   1. Introduce a unified monadic-step based orchestrator architecture that 
provides a cleaner and more consistent abstraction for the Kubernetes resource 
construction process.
   2. Introduce some dedicated tools for centrally parsing, verifying, and 
managing the Kubernetes parameters.
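
   To illustrate the two ideas above, here is a minimal, self-contained sketch 
of what a step-based decorator chain driven by a centrally validated parameter 
object could look like. It is not the code from this PR: `PodParameters`, 
`PodSpec`, `StepDecorator`, `buildPod`, and the `cluster-id` / `memory-mb` keys 
are all hypothetical names chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DecoratorSketch {

    /** Hypothetical parameter holder: parses and validates raw config in one place. */
    static final class PodParameters {
        final String clusterId;
        final int memoryMb;

        PodParameters(Map<String, String> rawConfig) {
            String id = rawConfig.get("cluster-id"); // hypothetical key
            if (id == null || id.isEmpty()) {
                throw new IllegalArgumentException("cluster-id must be set");
            }
            int mem = Integer.parseInt(rawConfig.getOrDefault("memory-mb", "1024"));
            if (mem <= 0) {
                throw new IllegalArgumentException("memory-mb must be positive");
            }
            this.clusterId = id;
            this.memoryMb = mem;
        }
    }

    /** Hypothetical immutable stand-in for a pod specification. */
    static final class PodSpec {
        final Map<String, String> fields;

        PodSpec(Map<String, String> fields) {
            this.fields = Collections.unmodifiableMap(new LinkedHashMap<>(fields));
        }

        /** Returns a copy with one more field set; the receiver is left unchanged. */
        PodSpec with(String key, String value) {
            Map<String, String> copy = new LinkedHashMap<>(fields);
            copy.put(key, value);
            return new PodSpec(copy);
        }
    }

    /** One step: takes the pod built so far and returns an extended copy. */
    interface StepDecorator {
        PodSpec decorate(PodSpec pod);
    }

    /** Orchestrator: applies every step in order, starting from an empty pod. */
    static PodSpec buildPod(PodParameters params) {
        List<StepDecorator> steps = new ArrayList<>();
        // Steps read the validated parameters, never the raw configuration.
        steps.add(pod -> pod.with("name", params.clusterId));
        steps.add(pod -> pod.with("memory", params.memoryMb + "Mi"));

        PodSpec pod = new PodSpec(Collections.emptyMap());
        for (StepDecorator step : steps) {
            pod = step.decorate(pod);
        }
        return pod;
    }

    public static void main(String[] args) {
        Map<String, String> raw = new LinkedHashMap<>();
        raw.put("cluster-id", "demo");
        PodSpec pod = buildPod(new PodParameters(raw));
        System.out.println(pod.fields);
    }
}
```

   In this sketch every step has the same shape, so adding a feature means 
adding one more step rather than touching the orchestrator, and invalid 
parameters fail early in a single constructor instead of inside each step.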
   
   ## Open Questions
   1. For the new design, we change the owner from the internal Service to the 
Deployment for GC. There are several considerations:
   - We do not need the internal Service to forward requests from the 
TaskManager to the JobManager in HA mode; that Service would be removed in such 
a scenario in another issue.
   - Resources like Deployments are first-class citizens in Kubernetes, so it is 
reasonable that deleting the controller that runs the master leads to the 
clean-up of all the other resources that together represent that Application.
   
   2. For the new design, we don't listen for the **ADD** event when creating 
the rest Service. The previous design assumes that the Service is ready once the 
client receives the **ADD** event. However, this is incorrect for both the 
LoadBalancer and the NodePort type. We plan to open another issue to fix this 
problem.
   
   ## Brief change log
   
   Main changes are:
   
   - 
[e629bbc](https://github.com/apache/flink/commit/e629bbc4091e9288f74e2d6a9cfd689daabeb4a3):
 Trivial code clean-up and test code normalization.
   - 
[675151e](https://github.com/apache/flink/commit/675151e02d7b91e0736963ed2b24f2b8c3ff7046):
 Remove the existing decorator design patterns.
   - 
[0355f0a](https://github.com/apache/flink/commit/0355f0a6d95bca530b4018fecf11db7560626956):
 Refactor and simplify KubernetesTestBase.
   - 
[edc3d23](https://github.com/apache/flink/commit/edc3d23742d64dcbd07b24b72b674c17ce06b6e7):
 Remove the Flink Configuration out of KubernetesResource.
   - 
[8d6e520](https://github.com/apache/flink/commit/8d6e5201b336a6238923292f96b9f0563a4f9029):
 Introduce some dedicated Kubernetes parameters parsing tools.
   - 
[c41a9a2](https://github.com/apache/flink/commit/c41a9a2b5a5f23b820cae038a0611d2d071c4ce9)
 to 
[23ed312](https://github.com/apache/flink/commit/23ed31201c0cd08d6bdb715721bba857ead2b520):
 Introduce the new Kubernetes decorator design pattern.
   - 
[710984c](https://github.com/apache/flink/commit/710984c24f05169a2c1e644676e05dbc573e6c3a):
 Rework the FlinkKubeClient to employ the new decorator pattern.
   - 
[a47f12d](https://github.com/apache/flink/commit/a47f12d54f832485ad54f273bc5a2f4901d4dce7)
 to 
[fb57917](https://github.com/apache/flink/commit/fb57917227853d0477aa1383d399a619146d7170):
 Minor improvements.
   
   
   ## Verifying this change
   
   This PR adds several test classes and many unit tests to cover most of the 
code branches of the new decorator design pattern.
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): (yes / **no**)
     - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
     - The serializers: (yes / **no** / don't know)
     - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
     - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (yes / **no** / 
don't know)
     - The S3 file system connector: (yes / **no** / don't know)
   
