dstandish edited a comment on issue #6210: [AIRFLOW-5567] BaseReschedulePokeOperator
URL: https://github.com/apache/airflow/pull/6210#issuecomment-558030520

> Having a state table will have a fundamental impact on the idempotency of the execution of the tasks.

It's optional to use such a thing, just as it is with XCom. If you don't use it, nothing changes.

> Why would the manual triggering of a dag introduce issues, the execution date will be equal to the moment that it was triggered. I think it should work as well.

Because execution_date is the run date minus one interval, and `xcom_pull` sorts by execution_date.

So, suppose I want to persist state with XCom (which I do in many jobs), and I have a daily job running at midnight. At the end of each run, we push some value to XCom. At the start of the next run, we retrieve the last value and use it somehow.

Consider this case:

* run 1: 12am D1
* run 2: manually triggered at 8am (exec date is D1 8am; xcom retrieves from run 1)
* run 3: 12am D2
* run 4: 12am D3
* run 5: 12am D4

Outcome:

* Run 3 will retrieve the XCom from run 1, because its own execution date is prior to run 2's execution date.
* Run 4 retrieves the XCom from run 2 for the same reason.
* Run 5 retrieves the XCom from run 4 (finally things are back in order); the run 3 XCom is never retrieved by any job.

The schedule interval edge PR would resolve the execution date ordering problem. But if XCom is cleared at the start of a task, it remains problematic as a mechanism for state persistence.

> Since this will introduce such as a fundamental change to the way operators were intended, being idempotent, I think it would be great to first start an AIP on the topic, so we can have a clear and structured approach.

An AIP sounds appropriate. But I'll just note that I see this more as better support for a common usage pattern than as a fundamental change to anything. I suspect stateful use of Airflow (including the use of XCom as a state persistence mechanism) is quite common.
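
To make the ordering in the five-run case concrete, here is a minimal sketch in plain Python. It does not call Airflow at all; it is a simplified model that assumes a pull returns the stored value with the latest execution date strictly earlier than the current run's execution date. The `simulate` function, the `runs` list, and the calendar date used for D1 are all hypothetical names and values chosen for illustration.

```python
from datetime import datetime, timedelta

INTERVAL = timedelta(days=1)  # daily schedule, as in the example


def simulate(runs):
    """Return {run name: name of the run whose value it pulled}.

    `runs` is a list of (name, wall_clock_start, is_manual) tuples in the
    order the runs actually execute. This is a toy model, not Airflow code.
    """
    store = {}   # execution_date -> name of the run that pushed a value
    pulled = {}
    for name, started, is_manual in runs:
        # Scheduled runs: execution date is the start time minus one interval.
        # Manual runs: execution date is the trigger time itself.
        exec_date = started if is_manual else started - INTERVAL
        # Assumed pull rule: newest value with an earlier execution date.
        earlier = [d for d in store if d < exec_date]
        pulled[name] = store[max(earlier)] if earlier else None
        # Push this run's value at the end of the run.
        store[exec_date] = name
    return pulled


d1 = datetime(2019, 11, 1)  # arbitrary stand-in for "D1"
runs = [
    ("run 1", d1, False),
    ("run 2", d1 + timedelta(hours=8), True),  # manual trigger at 8am
    ("run 3", d1 + INTERVAL, False),
    ("run 4", d1 + 2 * INTERVAL, False),
    ("run 5", d1 + 3 * INTERVAL, False),
]

pulled = simulate(runs)
for name, source in pulled.items():
    print(f"{name} pulled from {source}")
```

Under those assumptions, the mapping matches the outcome listed above: run 3 pulls from run 1, run 4 from run 2, run 5 from run 4, and no run ever pulls the value pushed by run 3.
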
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services
