mnojek opened a new pull request #21382: URL: https://github.com/apache/airflow/pull/21382
**The story behind this change:** This change is inspired by [AIP-47](https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-47+New+design+of+Airflow+System+Tests), which is still in the *discussion* phase. In this new design of system tests, we need to use [Trigger Rules](https://airflow.apache.org/docs/apache-airflow/stable/concepts/dags.html#trigger-rules) to handle `teardown` tasks (e.g. for cleaning up resources required by the system test that has just run).

The main concern is that when we use a `teardown` task with the trigger rule `all_done` (to make sure it is executed even if something has gone wrong in the middle of the test), the whole test (which is just a DAG) takes its result from this particular `teardown` task, and we can lose the information that some task in the middle failed. This is not the expected workflow for tests, because we want the test to fail if any step (task) fails.

The reason the whole test gets the status of the `teardown` task and does not signal that anything failed in the middle is how Airflow works: the DAG Run status is determined by the status of the "leaf nodes" (the tasks that do not have any children). Since the `teardown` task is a leaf node, the whole test gets the same status (which is almost always `success`).

That's why we need another `watcher` task, with its trigger rule set to `one_failed`, that is downstream of every other task in the test (DAG). Thanks to this, it is triggered if any task in the DAG fails, and its status is then propagated to the DAG Run (because it is a leaf node).

While researching the documentation and code, I found it very difficult to find out how trigger rules work in detail, which is why I thought it would be good to extend the documentation for them. Since I had already spent some time understanding them deeply, I also made the effort to prepare this PR.
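To make the leaf-node reasoning above concrete, here is a small, self-contained toy model in plain Python. It is not Airflow's scheduler and does not use Airflow's API: the helpers `simulate` and `dag_run_state` are hypothetical names, only three trigger rules are modelled in simplified form, and the `watcher` task is assumed to always fail when it actually runs, so that its failure becomes the DAG Run status.

```python
from typing import Dict, List, Set

SUCCESS, FAILED, UPSTREAM_FAILED, SKIPPED = (
    "success", "failed", "upstream_failed", "skipped",
)
BAD = (FAILED, UPSTREAM_FAILED)


def simulate(tasks: List[str], upstream: Dict[str, List[str]],
             rules: Dict[str, str], failing: Set[str]) -> Dict[str, str]:
    """Toy evaluation of task states; `tasks` must be in topological order."""
    state: Dict[str, str] = {}
    for task in tasks:
        ups = [state[u] for u in upstream.get(task, [])]
        rule = rules.get(task, "all_success")
        if rule == "all_success":
            if any(s in BAD for s in ups):
                state[task] = UPSTREAM_FAILED
                continue
            if any(s == SKIPPED for s in ups):
                state[task] = SKIPPED
                continue
        elif rule == "one_failed":
            # Runs only if at least one upstream task failed.
            if not any(s in BAD for s in ups):
                state[task] = SKIPPED
                continue
        # "all_done" falls through: the task runs whatever happened upstream.
        state[task] = FAILED if task in failing else SUCCESS
    return state


def dag_run_state(tasks: List[str], upstream: Dict[str, List[str]],
                  state: Dict[str, str]) -> str:
    """The run fails if any leaf task (a task with no children) failed."""
    has_children = {u for ups in upstream.values() for u in ups}
    leaves = [t for t in tasks if t not in has_children]
    return FAILED if any(state[t] in BAD for t in leaves) else SUCCESS


# Without a watcher: teardown (all_done) is the only leaf.
tasks = ["setup", "test_step", "teardown"]
upstream = {"test_step": ["setup"], "teardown": ["test_step"]}
rules = {"teardown": "all_done"}
hidden = dag_run_state(
    tasks, upstream, simulate(tasks, upstream, rules, {"test_step"}))

# With a watcher (one_failed) downstream of every other task.
tasks_w = tasks + ["watcher"]
upstream_w = dict(upstream, watcher=["setup", "test_step", "teardown"])
rules_w = dict(rules, watcher="one_failed")
caught = dag_run_state(
    tasks_w, upstream_w,
    simulate(tasks_w, upstream_w, rules_w, {"test_step", "watcher"}))
clean = dag_run_state(
    tasks_w, upstream_w, simulate(tasks_w, upstream_w, rules_w, {"watcher"}))

print(hidden, caught, clean)
```

In this sketch, the first run hides the failure of `test_step` because `teardown` is the only leaf, while adding the `watcher` makes the same failure visible and leaves a fully passing run unaffected, since the `watcher` is then skipped.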
This PR is not big, but it took me several hours to prepare these statements. I am not sure that all of them are correct, so please read them carefully and correct me if I'm wrong, and I will edit the PR.

I would also like to start a discussion about trigger rules. They seemed simple to me at first, but the more time I spent with them, the more I realized that they add a lot of complexity to task execution. I hope this PR will make it easier to understand how they work. If you have any ideas on how we can make them even better, I am glad to discuss them.
