Dear Airflow community, I've been looking to make a new SLA proposal by crowdsourcing all the known issues and proposals that have been raised over the last couple of years. In summary, I believe that defining SLAs at the task level puts too much work on the scheduler, if we were to solve all of the known issues. Given this clear downside, task-level callbacks may not be strictly necessary, especially with tools like DateTimeTriggers that can substitute the function of task-level SLA callbacks.
On the other hand, I believe that SLAs defined at the DAG level will be extremely useful as a 'catch-all' alert in case anything goes wrong in a DAG (e.g. significant delays in task execution, tasks stuck in queued state). In addition, SLAs defined at the DAG level will be incredibly lightweight to detect and execute callbacks for. Hence, I'd like to propose that SLAs be defined as a DAG-level attribute. If you are interested in this feature, please take a look at my detailed proposal and example implementations in the below Google Doc. Please feel free to reply to this message, or simply leave comments in the Doc. https://docs.google.com/document/d/1drNaYmAy6GqC4WGGn4MNt6VqbOwVNm7jPfmr5Pc52AU/edit?usp=sharing Thank you! Github: syun64 <https://github.com/syun64> -- Sung Yun Cornell '20 Master of Engineering in Computer Science
