Hi folks!

We would like to propose a new feature in Airflow: a boolean
parameter, "persist_xcom_through_retry", on all Airflow operators.
Our team added this feature in our internal fork a few years back, and it
has been benefiting our users extensively.

I have created an AIP at
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=399278333
Below is a summary (in the complete AIP, we have a more detailed problem
statement and quite a few interesting use-case examples):




*Traditionally, XCom is defined as “a mechanism that lets Tasks talk to
each other”. However, XCom also has the capacity and potential to help
persist and manage task state within a task itself. Currently, Apache
Airflow automatically clears a task instance’s XCom data when it is
retried. This behavior, while ensuring clean state for retry attempts,
creates limitations:*

   - *Loss of Internal Progress: Tasks that have internal checkpointing or
   progress tracking lose all intermediate state on retry, forcing restart
   from the beginning.*
   - *Resource State Loss: Tasks cannot maintain state about allocated
   resources (compute instances, downstream job IDs, etc.) across retry
   attempts, leading to redundant expensive setup operations.*
   - *No Recovery/Resume Capability: There's no way for tasks to resume
   from internal checkpoints when transient failures occur during
   long-running atomic operations.*
   - *Poor User Experience: Users must implement external state management
   systems to work around this limitation, adding complexity to DAG authoring.*


*This proposal aims to extend XCom by allowing a Task Instance’s XCom
to persist through its retries, enabling users to build more resilient and
efficient pipelines. This is particularly useful for tasks that are atomic
(so one such task cannot be split into multiple tasks) and need to manage
internal state or checkpoints.*
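
To make the intended behavior concrete, here is a minimal sketch in plain
Python. It does not import Airflow and does not reflect the actual
implementation in the AIP or in our fork. FakeXComStore, run_with_retries,
and make_chunked_task are hypothetical names invented for this illustration;
only the flag name persist_xcom_through_retry comes from the proposal. The
sketch assumes the flag simply skips the XCom clearing step before a retry.

```python
# Pure-Python simulation of the proposed flag. Nothing here is Airflow API;
# every class and function name is a hypothetical stand-in.


class FakeXComStore:
    """Stand-in for an XCom backend, keyed by (task_id, key)."""

    def __init__(self):
        self._data = {}

    def push(self, task_id, key, value):
        self._data[(task_id, key)] = value

    def pull(self, task_id, key, default=None):
        return self._data.get((task_id, key), default)

    def clear(self, task_id):
        # Drop every entry belonging to this task.
        self._data = {k: v for k, v in self._data.items() if k[0] != task_id}


def run_with_retries(task_id, fn, store, retries,
                     persist_xcom_through_retry=False):
    """Run fn up to retries + 1 times.

    Before each retry the task's XCom is cleared, unless the (assumed)
    persist_xcom_through_retry flag is set.
    """
    for attempt in range(retries + 1):
        if attempt > 0 and not persist_xcom_through_retry:
            store.clear(task_id)
        try:
            return fn(task_id, store)
        except RuntimeError:
            if attempt == retries:
                raise  # out of retries, surface the failure


def make_chunked_task(total_chunks, fail_once_at, processed_log):
    """Build an atomic task that checkpoints its progress in XCom.

    The task fails exactly once, just before processing chunk fail_once_at,
    to mimic a transient error.
    """
    state = {"failed": False}

    def task(task_id, store):
        # Resume from the last checkpoint if one survived the retry.
        start = store.pull(task_id, "next_chunk", default=0)
        for chunk in range(start, total_chunks):
            if chunk == fail_once_at and not state["failed"]:
                state["failed"] = True
                raise RuntimeError("transient failure")
            processed_log.append(chunk)
            store.push(task_id, "next_chunk", chunk + 1)

    return task


# Current behavior: XCom is cleared on retry, so the task starts over.
log_default = []
run_with_retries("t", make_chunked_task(5, 3, log_default),
                 FakeXComStore(), retries=1)

# Proposed behavior: the checkpoint survives, so the task resumes at chunk 3.
log_persist = []
run_with_retries("t", make_chunked_task(5, 3, log_persist),
                 FakeXComStore(), retries=1,
                 persist_xcom_through_retry=True)
```

In this sketch, the default run reprocesses chunks 0 through 2 after the
failure, while the run with the flag enabled processes each chunk once.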


We look forward to your feedback and thoughts. Thanks!


Regards,

XD
