potiuk edited a comment on issue #17490:
URL: https://github.com/apache/airflow/issues/17490#issuecomment-894785162


   > For example, in some cases PVCs are used in an 'assembly-line' where they 
are moved from one task to another and in others they are meant to be bound to 
the life-cycle of a kube Job and forgotten about -- implicitly created and 
destroyed together. An elegant way to surface these two kinds of relationships 
to PVCs is what I was puzzling over for a while.
   
   One thing to remember is that while Airflow loves K8S, K8S is not the only 
way it is and will be deployed. We do not want people to tie their task 
implementation to the fact that it runs on K8S. A task, while executing, 
should be largely unaware of what deployment it runs on.
   
   I think a custom XCom backend with PVC support for K8S could be useful 
for that (and should be possible to write). The fact that a task runs on K8S 
should be a deployment detail, but if there is some inter-task communication, 
Airflow has its own deployment-independent mechanism, namely XCom. As of 
recently we can implement custom XCom backends, which serve precisely this 
purpose and should be used for any kind of data sharing in Airflow.
   
   I think you could write an XCom backend implementation that uses a PVC 
under the hood for those who want to use it and run their tasks on K8S (and 
possibly contribute it back to Airflow).
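   
   A minimal sketch of the storage logic such a backend might use, kept free 
of Airflow imports so that it runs standalone. The class name, the `pvc://` 
reference prefix and the `XCOM_PVC_MOUNT` variable are hypothetical, and it 
assumes the PVC is mounted at the same path in every task pod. A real backend 
would subclass `airflow.models.xcom.BaseXCom` and be selected with the 
`xcom_backend` option in the `[core]` section:

```python
import json
import os
import pickle
import tempfile
import uuid
from types import SimpleNamespace

# Assumption: the PVC is mounted at this path in every task pod.
# The variable name is hypothetical; the temp dir is only a local fallback.
PVC_MOUNT = os.environ.get("XCOM_PVC_MOUNT", tempfile.gettempdir())
PREFIX = "pvc://"


class PVCXComSketch:
    """Mirrors the two hooks a custom XCom backend overrides."""

    @staticmethod
    def serialize_value(value, **kwargs):
        # Write the payload to the shared volume ...
        name = f"{uuid.uuid4().hex}.pkl"
        with open(os.path.join(PVC_MOUNT, name), "wb") as fh:
            pickle.dump(value, fh)
        # ... and return only a small reference for the metadata database.
        return json.dumps(PREFIX + name).encode("utf-8")

    @staticmethod
    def deserialize_value(result):
        # `result.value` holds whatever serialize_value returned.
        ref = json.loads(result.value)
        if isinstance(ref, str) and ref.startswith(PREFIX):
            path = os.path.join(PVC_MOUNT, ref[len(PREFIX):])
            with open(path, "rb") as fh:
                return pickle.load(fh)
        # Anything else is treated as a plain JSON value.
        return ref


# Round trip, with SimpleNamespace standing in for the stored XCom row.
stored = PVCXComSketch.serialize_value({"rows": [1, 2, 3]})
restored = PVCXComSketch.deserialize_value(SimpleNamespace(value=stored))
```

   The task code itself only pushes and pulls XComs as usual; the PVC stays a 
deployment detail hidden inside the backend. Cleanup of old files on the 
volume is left out of this sketch.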
   
   You can learn more about these concepts, about Airflow's "love" for 
Kubernetes, and about the custom XCom approach that we are starting to make 
"full use of", in some of the talks from the recent Airflow Summit:
   
   * Why and how Airflow Loves Kubernetes: 
https://airflowsummit.org/sessions/2021/airflow-loves-kubernetes/
   * How Ray will integrate with Airflow shortly (in short, via custom XCom 
backends and custom decorators): 
https://airflowsummit.org/sessions/2021/airflow-ray/
   * How Custom XCom Backends work for data sharing: 
https://airflowsummit.org/sessions/2021/customizing-xcom-to-enhance-data-sharing-between-tasks/
   
   



