davlum commented on issue #5481: [AIRFLOW-4851] Refactor K8S codebase with k8s API models
URL: https://github.com/apache/airflow/pull/5481#issuecomment-506757977
 
 
   I have a couple of thoughts and would appreciate some feedback. Currently 
there are four places that I am aware of where a pod can be configured or 
created: 
   1. From the `airflow.cfg` with `KubernetesExecutor`.
   2. From an Operator with `KubernetesExecutor` using the argument 
`executor_config = { 'KubernetesExecutor': { ... }}`.
   3. From the `KubernetesPodOperator`.
   4. From the `pod_mutation_hook`.
   
   Ideally there'd be one interface for all of these, whereas currently there 
seem to be several: 1 uses `WorkerConfiguration`, 2 uses 
`KubernetesExecutorConfig`, and 3 uses `PodGenerator`. Each of these offers a 
different level of coverage of the Kubernetes API and then creates our custom 
`Pod` object, which itself must implement all parts of the Kubernetes API and 
offer a serialization method into JSON that conforms to the API. Ideally we 
would offer a very thin layer of abstraction over the creation of a `V1Pod` 
object for convenience (and have this be largely backwards compatible) _and_ 
allow creating a `V1Pod` object totally raw. This is similar to how an ORM's 
abstraction doesn't offer every possible feature of SQL, so you sometimes need 
to write raw SQL.
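   A minimal sketch of the "thin layer plus raw escape hatch" idea, under 
stated assumptions: plain dicts stand in for `V1Pod` so the snippet runs 
without the `kubernetes` client, and `make_pod`, `deep_merge` and the `raw` 
parameter are hypothetical names, not existing Airflow APIs.

```python
import copy


def deep_merge(base, override):
    """Return a copy of `base` with `override` merged in recursively.

    Nested dicts are merged key by key; any other value (lists included)
    in `override` replaces the value in `base` wholesale.
    """
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def make_pod(name, image, namespace="default", raw=None):
    """Hypothetical convenience layer: cover the common fields, then let
    the caller override or extend anything via a raw manifest fragment."""
    pod = {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {"containers": [{"name": "base", "image": image}]},
    }
    return deep_merge(pod, raw or {})


# Convenience arguments cover the common case; `raw` reaches fields the
# thin layer does not know about (here: labels and hostNetwork).
pod = make_pod(
    "task-1",
    "python:3.7",
    raw={"metadata": {"labels": {"team": "data"}}, "spec": {"hostNetwork": True}},
)
```

   With real API models the same shape would apply: the convenience function 
returns a pod object and the raw fragment is itself a pod object to be merged 
in. How lists such as `containers` should be merged is a design decision this 
sketch sidesteps by replacing them wholesale.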
   
   My question is about backwards-incompatible changes. For example, the 
`KubernetesPodOperator` takes some of Airflow's internal Kubernetes models as 
arguments, such as `list[airflow.kubernetes.pod.Port]`. I have less hesitation 
about changing that, as it resides in `/contrib`. As for the 
`executor_config`, I believe I can keep it backwards compatible for the most 
part. 
   
   The ideal scenario in my mind is that the full `V1Pod` be exposed to users 
if they need it, which would address tickets such as 
[AIRFLOW-4454](https://issues.apache.org/jira/browse/AIRFLOW-4454) and 
[AIRFLOW-3152](https://issues.apache.org/jira/browse/AIRFLOW-3152), as they 
would have full access to the API. In another ticket, we could add 
functionality to just pass in configuration as a JSON/YAML string/file, which 
was discussed briefly [on the mailing 
list](https://lists.apache.org/thread.html/313132da516fca340243f4927e64177ea393d5ccae829d96b99f8e16@%3Cdev.airflow.apache.org%3E).
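   To illustrate the JSON variant of that follow-up idea, here is a rough 
sketch that uses only the standard library. `pod_from_json` is a hypothetical 
helper, not an existing Airflow function, and its checks are deliberately 
minimal; real validation would be left to the Kubernetes API models.

```python
import json


def pod_from_json(text):
    """Parse a JSON pod manifest and run a few basic sanity checks.

    Hypothetical helper: it only checks that the document is a Pod with
    at least one container, then returns the parsed dict unchanged.
    """
    manifest = json.loads(text)
    if not isinstance(manifest, dict) or manifest.get("kind") != "Pod":
        raise ValueError("expected a JSON object with kind 'Pod'")
    spec = manifest.get("spec")
    if not isinstance(spec, dict) or not spec.get("containers"):
        raise ValueError("a Pod needs at least one entry in spec.containers")
    return manifest


sample = """
{
  "apiVersion": "v1",
  "kind": "Pod",
  "metadata": {"name": "task-1"},
  "spec": {"containers": [{"name": "base", "image": "python:3.7"}]}
}
"""
manifest = pod_from_json(sample)
```

   A YAML file could be handled the same way by swapping the parser, assuming 
a YAML library is available.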
