Hi all,

Just wondering what the best options are for more advanced alerting and anomaly detection on task metrics within Airflow.
Currently we have a job that sends metrics for each task run to Anodot <https://www.anodot.com/>, which is a really cool tool. However, as our DAGs tend to have many tasks and I'm sending about 6 or so metrics for each DAG run from the Airflow database, I've blown through the 50k monthly metrics our Anodot licence covers.

So just wondering what might be a more native way to do task monitoring in Airflow, if there is one. The main use case here is to catch cases where, even though a job is still running, its behaviour has changed significantly, which may be a sign of something that needs investigation.

Cheers,
Andy
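
P.S. To make the use case concrete, here is a minimal sketch of the kind of check I have in mind. It is not an Airflow feature, just plain Python: it compares the latest value of one task metric (say, run duration in seconds) against that task's recent history using a robust z-score based on the median and the median absolute deviation. The function name, the threshold of 3.5, the minimum history of 5 runs, and the sample numbers are all assumptions for illustration; in practice the history would have to be pulled from wherever the metrics live, for example the Airflow metadata database.

```python
from statistics import median


def is_anomalous(history, latest, threshold=3.5, min_runs=5):
    """Return True if `latest` deviates strongly from `history`.

    Uses a robust z-score: 0.6745 * |x - median| / MAD, where MAD is the
    median absolute deviation. All parameters are illustrative defaults.
    """
    if len(history) < min_runs:
        return False  # too little history to judge either way

    med = median(history)
    mad = median(abs(x - med) for x in history)

    if mad == 0:
        # History is (almost) constant, so any change at all stands out.
        return latest != med

    score = 0.6745 * abs(latest - med) / mad
    return score > threshold


# Hypothetical durations (seconds) of past runs of a single task.
past_durations = [60, 62, 58, 61, 59, 63, 60]

print(is_anomalous(past_durations, 61))   # a typical run
print(is_anomalous(past_durations, 180))  # a run that took three times as long
```

The median and MAD are used rather than the mean and standard deviation so that one earlier bad run does not inflate the baseline and hide the next one. The same check could be run per task and per metric (duration, rows processed, and so on).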
