bolkedebruin commented on code in PR #37058:
URL: https://github.com/apache/airflow/pull/37058#discussion_r1469303440

##########
docs/apache-airflow/core-concepts/xcoms.rst:
##########
@@ -56,6 +56,31 @@ XComs are a relative of :doc:`variables`, with the main difference being that XC
 If the first task run is not succeeded then on every retry task XComs will be cleared to make the task run idempotent.
+
+Object Storage XCom Backend
+---------------------------
+
+The default XCom backend is the :class:`~airflow.models.xcom.BaseXCom` class, which stores XComs in the Airflow database. This is fine for small values, but can be problematic for large values, or for large numbers of XComs.
+
+To enable storing XComs in an object store, you can set the ``xcom_backend`` configuration option to ``airflow.io.xcom.XComObjectStoreBackend``. You will also need to set ``xcom_objectstorage_path`` to the desired location. The connection
+id is obtained from the user part of the url the you will provide, e.g. ``xcom_objectstorage_path = s3://conn_id@mybucket/key``. Furthermore, ``xcom_objectstorage_threshold`` is required
+to be something larger than -1. Any object smaller than the threshold in bytes will be stored in the database and anything larger will be be
+put in object storage. This will allow a hybrid setup. If an xcom is stored on object storage a reference will be
+saved in the database. Finally, you can set ``xcom_objectstorage_compression`` to fsspec supported compression methods like ``zip`` or ``snappy`` to
+compress the data before storing it in object storage.
+
+So for example the following configuration will store anything above 1MB in S3 and will compress it using snappy::
+
+    xcom_backend = airflow.io.xcom.XComObjectStoreBackend
+    xcom_objectstorage_path = s3://conn_id@mybucket/key
+    xcom_objectstorage_threshold = 1048576
+    xcom_objectstorage_compression = snappy
+
+.. note::
+
+  Compression requires the support for it is installed in your python environment. For example, to use ``snappy`` compression, you need to install ``python-snappy``.

Review Comment:
   gzip is now used
