mnojek opened a new pull request #21382:
URL: https://github.com/apache/airflow/pull/21382


   **The story behind this change:**
   This change is somewhat inspired by the 
[AIP-47](https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-47+New+design+of+Airflow+System+Tests)
 that is still in the *discussion* phase. In this new design of system tests, 
we need to make use of [Trigger 
Rules](https://airflow.apache.org/docs/apache-airflow/stable/concepts/dags.html#trigger-rules)
 for handling `teardown` tasks (e.g. for cleaning up resources used by the 
recently executed system test). The main concern is that when we use a 
`teardown` task with trigger rule `all_done` (to make sure that it is executed 
even if something went wrong in the middle of the test), the whole test 
(which is just a DAG) takes its result from this particular `teardown` task, 
and we can lose the information that some task failed in the middle. This is 
not the expected behavior for tests, because we want the test to fail if any 
step (task) failed. The reason the whole test gets the status of the 
`teardown` task and does not signal that anything failed in the middle is how 
Airflow works: the DAG Run status is determined by the status of the "leaf 
nodes" (the tasks that do not have any children). Since the `teardown` task is 
the leaf node, the whole test gets the same status (which is almost always 
`success`). 
   That's why we need another `watcher` task, with its trigger rule set to 
`one_failed`, that is downstream of every other task in the test (DAG). 
Thanks to this, it is triggered if any of the tasks in the DAG fails, and 
its status is then propagated to the DAG Run (because it is a leaf node).
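
   The behavior described above can be illustrated with a small pure-Python 
model. This is only a sketch under stated assumptions, not Airflow's scheduler 
code: `run_dag`, the task names and the tuple layout are hypothetical, only the 
three trigger rules mentioned here are modeled, and the `watcher` is assumed to 
always fail when it runs (otherwise it would have no failure to propagate).

```python
from typing import Dict, List, Set, Tuple

# States that count as a failure when seen on an upstream task or a leaf.
FAILED_STATES = {"failed", "upstream_failed"}

# A task is (task_id, trigger_rule, upstream_task_ids); hypothetical layout.
Task = Tuple[str, str, List[str]]


def run_dag(tasks: List[Task], failing: Set[str]) -> Tuple[Dict[str, str], str]:
    """Simplified model: `tasks` must be listed in topological order.

    `failing` holds the ids of tasks that fail *if* they are executed.
    Returns the final state of every task and the resulting DAG Run state.
    """
    states: Dict[str, str] = {}
    has_downstream: Set[str] = set()

    for task_id, rule, upstream in tasks:
        has_downstream.update(upstream)
        up_states = [states[u] for u in upstream]
        any_failed = any(s in FAILED_STATES for s in up_states)
        any_skipped = any(s == "skipped" for s in up_states)

        if rule == "all_success":
            # Default rule: run only if every upstream task succeeded.
            if any_failed:
                states[task_id] = "upstream_failed"
                continue
            if any_skipped:
                states[task_id] = "skipped"
                continue
        elif rule == "all_done":
            # Run regardless of how the upstream tasks finished.
            pass
        elif rule == "one_failed":
            # Run only if at least one upstream task failed.
            if not any_failed:
                states[task_id] = "skipped"
                continue
        else:
            raise ValueError(f"trigger rule not modeled: {rule}")

        states[task_id] = "failed" if task_id in failing else "success"

    # The DAG Run state is derived from the leaf tasks only.
    leaves = [t for t, _, _ in tasks if t not in has_downstream]
    dag_state = "failed" if any(states[t] in FAILED_STATES for t in leaves) else "success"
    return states, dag_state


test_dag: List[Task] = [
    ("setup", "all_success", []),
    ("run_test", "all_success", ["setup"]),
    ("teardown", "all_done", ["run_test"]),
]

# Without a watcher, the failure of `run_test` is hidden by `teardown`.
_, without_watcher = run_dag(test_dag, failing={"run_test"})

# With a watcher downstream of every task, the failure reaches the DAG Run.
watched_dag = test_dag + [("watcher", "one_failed", ["setup", "run_test", "teardown"])]
_, with_watcher = run_dag(watched_dag, failing={"run_test", "watcher"})

print(without_watcher, with_watcher)  # success failed
```

   When no test task fails, the `watcher` in this model ends up `skipped`, so 
the DAG Run is still reported as `success`.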
   
   While researching the documentation and code, I found it very difficult 
to find detailed information about how trigger rules work, and that's why I 
thought it would be good to extend the documentation for them. Since I had 
already spent some time understanding them deeply, I also took the effort to 
prepare this PR. It's not big, but it took me several hours to prepare these 
statements. I am not sure that all the statements are correct, so please read 
them carefully and correct me if I'm wrong, and I will edit the PR. 
   
   I would also like to start a discussion about the trigger rules. To me 
they seemed simple at first, but the more time I spent with them, the more I 
realized that they introduce a lot of complications to task execution. I 
hope that this PR will make it easier to understand how they work. If you have 
any ideas about how we can make them even better, I am glad to discuss them.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

