nicolamarangoni opened a new issue, #27180:
URL: https://github.com/apache/airflow/issues/27180

   ### Apache Airflow version
   
   2.4.1
   
   ### What happened
   
   We have an Airflow deployment on **AWS Fargate** where each service (webserver, scheduler, dag-processor, celery workers) runs in its own task, and each task has a different hostname and IP.
   We also have a separate task with a **datadog-agent** running a statsd (**dogstatsd**) server to which we want to push Airflow metrics.
   We have set `statsd_host=DATADOG_AGENT_HOST` and `statsd_datadog_enabled="True"` (see the config sketch below).
   Statsd successfully collects metrics like **dagbag_size** and **scheduler_heartbeat**, but **NOT** **healthy** and **can_connect**.
   We have a similar deployment on k8s in which all service containers run in the same pod, with a single hostname and IP for everything. In some k8s setups the datadog-agent is in the same pod, in others in a separate pod. In the second case we set **statsd_host = DDAGENT_POD_HOST** and we successfully collect all metrics.
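   
   For reference, a minimal sketch of the relevant `[metrics]` section of our `airflow.cfg` (the host value is a placeholder for the datadog-agent task's address; the port is the default dogstatsd port):
   
   ```ini
   [metrics]
   statsd_on = True
   # placeholder for the address of the datadog-agent Fargate task
   statsd_host = DATADOG_AGENT_HOST
   statsd_port = 8125
   statsd_prefix = airflow
   statsd_datadog_enabled = True
   ```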
   
   My suspicion is that metrics are not pushed to statsd by every service independently but **only by the scheduler service**.
   The scheduler service checks **healthy** and **can_connect** and pushes them to statsd. However, these two metrics originate in the webserver. If the webserver runs on a different host than the scheduler (like different tasks in Fargate), the metrics are not collected.
   
   Is it possible that the scheduler is the only service that pushes metrics to statsd, and that it tries to collect metrics from the other services by always looking for them on localhost?
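   
   One way to verify this (a minimal debugging sketch, not part of our deployment) would be to temporarily point each service's `statsd_host` at a throwaway UDP listener and check which metric names actually arrive from which host:
   
   ```python
   # debug_statsd_listener.py - prints every statsd packet it receives together
   # with the sender's IP, so you can see which service emits which metric.
   # Quick debugging aid only, not meant for production use.
   import socket
   
   sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
   sock.bind(("0.0.0.0", 8125))  # default statsd/dogstatsd port
   
   while True:
       data, addr = sock.recvfrom(65535)
       # statsd packets are plain text, e.g. "airflow.scheduler_heartbeat:1|c"
       print(addr[0], data.decode("utf-8", errors="replace").strip())
   ```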
   
   ### What you think should happen instead
   
   All metrics should be sent to statsd.
   
   ### How to reproduce
   
   Set up Airflow with every service running in a separate pod/host/task with its own IP and hostname.
   Activate statsd on every service (for example via the environment variables sketched below).
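   
   A sketch of the environment variables we would set on each service's container (following Airflow's `AIRFLOW__SECTION__KEY` convention; the host value is a placeholder):
   
   ```
   AIRFLOW__METRICS__STATSD_ON=True
   AIRFLOW__METRICS__STATSD_HOST=DATADOG_AGENT_HOST
   AIRFLOW__METRICS__STATSD_PORT=8125
   AIRFLOW__METRICS__STATSD_DATADOG_ENABLED=True
   ```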
   
   ### Operating System
   
   Debian GNU/Linux 11 (bullseye)
   
   ### Versions of Apache Airflow Providers
   
   apache-airflow-providers-amazon==6.0.0
   apache-airflow-providers-celery==3.0.0
   apache-airflow-providers-cncf-kubernetes==4.4.0
   apache-airflow-providers-common-sql==1.2.0
   apache-airflow-providers-databricks==3.3.0
   apache-airflow-providers-datadog==3.0.0
   apache-airflow-providers-dbt-cloud==2.2.0
   apache-airflow-providers-docker==3.2.0
   apache-airflow-providers-elasticsearch==4.2.1
   apache-airflow-providers-ftp==3.1.0
   apache-airflow-providers-google==8.3.0
   apache-airflow-providers-grpc==3.0.0
   apache-airflow-providers-hashicorp==3.1.0
   apache-airflow-providers-http==4.0.0
   apache-airflow-providers-imap==3.0.0
   apache-airflow-providers-microsoft-azure==4.3.0
   apache-airflow-providers-mysql==3.2.1
   apache-airflow-providers-odbc==3.1.2
   apache-airflow-providers-postgres==5.2.2
   apache-airflow-providers-redis==3.0.0
   apache-airflow-providers-sendgrid==3.0.0
   apache-airflow-providers-sftp==4.1.0
   apache-airflow-providers-slack==5.1.0
   apache-airflow-providers-snowflake==3.3.0
   apache-airflow-providers-sqlite==3.2.1
   apache-airflow-providers-ssh==3.2.0
   
   ### Deployment
   
   Other
   
   ### Deployment details
   
   Airflow on AWS Fargate (ECS) with a separate task for every service.
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

