bhupixb opened a new issue #18630:
URL: https://github.com/apache/airflow/issues/18630


   ### Apache Airflow version
   
   2.1.3
   
   ### Operating System
   
   Debian GNU/Linux 10 (buster)
   
   ### Versions of Apache Airflow Providers
   
   apache-airflow-providers-amazon==2.0.0
   apache-airflow-providers-celery==2.0.0
   apache-airflow-providers-cncf-kubernetes==2.0.2
   apache-airflow-providers-docker==2.0.0
   apache-airflow-providers-elasticsearch==2.0.2
   apache-airflow-providers-ftp==2.0.0
   apache-airflow-providers-google==5.0.0
   apache-airflow-providers-grpc==2.0.0
   apache-airflow-providers-hashicorp==2.0.0
   apache-airflow-providers-http==2.0.0
   apache-airflow-providers-imap==2.0.0
   apache-airflow-providers-microsoft-azure==2.0.0
   apache-airflow-providers-mysql==2.1.0
   apache-airflow-providers-postgres==2.0.0
   apache-airflow-providers-redis==2.0.0
   apache-airflow-providers-sendgrid==2.0.0
   apache-airflow-providers-sftp==2.1.0
   apache-airflow-providers-slack==4.0.0
   apache-airflow-providers-sqlite==2.0.0
   apache-airflow-providers-ssh==2.1.0
   
   ### Deployment
   
   Other
   
   ### Deployment details
   
   We have modified the official Airflow Helm chart to meet our needs.
   Kubernetes version: 1.15.12
   
   
   ### What happened
   
   We followed the official [documentation](https://airflow.apache.org/docs/apache-airflow/stable/logging-monitoring/metrics.html) for setting up Airflow metrics with StatsD, and we use Prometheus to pull those metrics from StatsD.
   Our ConfigMap for the StatsD `mapping.yaml` is here: https://ideone.com/cotYSG.
   
   The problem is that the metrics `dagrun.duration.success.<dag_id>` and `dagrun.duration.failed.<dag_id>` are not arriving in StatsD. Most other metrics arrive fine.
   
   Our statsd configuration:
   metrics:
       statsd_on: 'True'
       statsd_port: 9125
       statsd_prefix: airflow
       statsd_host: airflow-statsd
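
To check whether the receiving side handles these timers at all, independently of Airflow, one option is to send a hand-built StatsD timing datagram to the host and port from the configuration above. The following is only a minimal sketch: the function names and the DAG id `example_dag` are hypothetical, and it assumes the receiver accepts plain StatsD timer lines (`<name>:<value>|ms`) over UDP.

```python
import socket


def build_timing_line(prefix: str, metric: str, millis: int) -> bytes:
    """Format a StatsD timer datagram: <prefix>.<metric>:<value>|ms"""
    return f"{prefix}.{metric}:{millis}|ms".encode("ascii")


def send_timing(host: str, port: int, prefix: str, metric: str, millis: int) -> bytes:
    """Send one timer datagram over UDP and return the payload that was sent.

    UDP is fire-and-forget: a successful send does not prove the receiver
    processed the metric, so check the receiver's output afterwards.
    """
    payload = build_timing_line(prefix, metric, millis)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
    return payload
```

From a pod that can resolve the StatsD service, calling `send_timing("airflow-statsd", 9125, "airflow", "dagrun.duration.success.example_dag", 1500)` and then looking for the metric in Prometheus would indicate whether the mapping or the emitting side is the more likely place to investigate.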
   
   
   
    
   
   ### What you expected to happen
   
   The metrics `dagrun.duration.success.<dag_id>` and `dagrun.duration.failed.<dag_id>` should also arrive in StatsD.
   We need them to set up Prometheus alerts, e.g. for long-running DAGs.
   
   ### How to reproduce
   
   We use our own custom Helm chart, so we are not sure how others can reproduce this. We run everything inside a Kubernetes cluster, with StatsD, the scheduler, and the webserver each running in their own pod.
   
   ### Anything else
   
   This happens about 99% of the time. Occasionally we do see the two metrics above in Prometheus, but we cannot find any correlation explaining why they appeared at that specific time and for that DAG.
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.