Hello all,

The Airflow 2.0 release is getting closer and closer. I would like to
start a discussion about custom XCom backends.

First of all, in case you don't know: since 1.10.12 users can use a
custom XCom class that overrides the serialize_value and
deserialize_value methods. Docs:
https://airflow.apache.org/docs/stable/concepts.html#custom-xcom-backend

This feature allows users to:
- reduce boilerplate code responsible for downloading / uploading data
in operators (it's handled by the custom XCom)
- use different storage for XCom data (other database, buckets, cache etc.)
- verify XCom data on read/write operations
- do anything else that may be feasible
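
To make the points above concrete, here is a rough sketch of the shape
such a backend could take. It is not taken from the PR. To keep it
runnable without Airflow installed, the class does not inherit from
anything; a real backend would subclass airflow.models.xcom.BaseXCom.
FAKE_BUCKET, PREFIX, StoredRow and ExternalStoreXCom are hypothetical
names, and the in-memory dict only stands in for a bucket or cache.

```python
import json
from typing import Any, Dict

# Hypothetical stand-in for external storage (bucket, cache, other database).
FAKE_BUCKET: Dict[str, bytes] = {}
PREFIX = "fake-bucket://"


class StoredRow:
    """Stand-in for the XCom row that Airflow passes to deserialize_value."""

    def __init__(self, value: bytes) -> None:
        self.value = value


class ExternalStoreXCom:
    """Sketch only; a real backend would subclass airflow.models.xcom.BaseXCom."""

    @staticmethod
    def serialize_value(value: Any) -> bytes:
        # Verification on write: json.dumps raises TypeError for unsupported data.
        payload = json.dumps(value).encode("utf-8")
        key = f"{PREFIX}{len(FAKE_BUCKET)}"
        FAKE_BUCKET[key] = payload  # the "upload" the operator no longer does
        # Only the reference is returned, so only it lands in the metadata DB.
        return json.dumps(key).encode("utf-8")

    @staticmethod
    def deserialize_value(result: Any) -> Any:
        stored = json.loads(result.value.decode("utf-8"))
        # Plain values written before the backend was enabled pass through.
        if not isinstance(stored, str) or not stored.startswith(PREFIX):
            return stored
        return json.loads(FAKE_BUCKET[stored].decode("utf-8"))


reference = ExternalStoreXCom.serialize_value({"rows": [1, 2, 3]})
restored = ExternalStoreXCom.deserialize_value(StoredRow(reference))
```

With a class like this importable on the scheduler and workers, the
backend is selected through the xcom_backend option described in the
docs linked above; operators keep pushing and pulling plain Python
values.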

Some examples:
https://github.com/apache/airflow/pull/12733
https://www.polidea.com/blog/airflow-2-0-dag-authoring-redesigned/#custom-xcom-backends-8560

The point I want to raise (as I did in this PR
https://github.com/apache/airflow/pull/12733) is whether we as a
community want to have custom XComs in our codebase (core or
providers). I'm happy to hear what the community thinks about it.

From my side, I'm leaning toward creating better documentation around
this feature (with examples and suggestions) instead of accepting
custom XComs into the codebase. My main concern is that custom XComs
are easy to write (using hooks, for example) and will work best when
they are built to suit users' exact needs. On the other hand, I see
some potential in "low level" XComs that just implement the logic of
storing and retrieving data from a particular storage. But anything
that gets too use-case / data type specific should not be accepted.

Cheers,
Tomek
