ahronrosenboimvim opened a new issue, #47992:
URL: https://github.com/apache/airflow/issues/47992

   ### Apache Airflow version
   
   Other Airflow 2 version (please specify below)
   
   ### If "Other Airflow 2 version" selected, which one?
   
   2.9.0
   
   ### What happened?
   
   I've configured Airflow 2.9.0 with OpenTelemetry metrics integration to send 
metrics to Prometheus via an OpenTelemetry collector. While task instance 
creation metrics (airflow_task_instance_created_*) are correctly appearing in 
Prometheus, task instance state metrics like ti.start and ti.finish are not 
appearing, despite being emitted in the code.
   In taskinstance.py the code emits these metrics in the _run_raw_task 
function:
   ```
   Stats.incr(f"ti.start.{ti.task.dag_id}.{ti.task.task_id}", tags=ti.stats_tags)
   # Same metric with tagging
   Stats.incr("ti.start", tags=ti.stats_tags)
   ```
   And later:
   ```
   Stats.incr(f"ti.finish.{ti.dag_id}.{ti.task_id}.{ti.state}", tags=ti.stats_tags)
   # Same metric with tagging
   Stats.incr("ti.finish", tags={**ti.stats_tags, "state": str(ti.state)})
   ```
   However, neither of these metrics appears in Prometheus, while other metrics 
such as DAG processing metrics and task instance creation metrics are visible.
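
To make the expected names concrete, here is a minimal sketch of the strings those calls would build for one task. The `dag_id`, `task_id` and `state` values are made up, the `airflow` prefix is an assumption based on the `airflow_` prefix seen in Prometheus, and the helper `prometheus_name` is hypothetical: it only approximates how a Prometheus exporter sanitizes names (characters outside `[a-zA-Z0-9_:]` become `_`), and an exporter may also append suffixes such as `_total`.

```python
import re

# Hypothetical values; substitute those of a real task instance.
dag_id, task_id, state = "example_dag", "example_task", "success"

# Names as built by the Stats.incr calls quoted above.
emitted = [
    f"ti.start.{dag_id}.{task_id}",
    "ti.start",
    f"ti.finish.{dag_id}.{task_id}.{state}",
    "ti.finish",
]


def prometheus_name(name: str, prefix: str = "airflow") -> str:
    """Approximate the name to search for in Prometheus (assumed sanitization)."""
    return re.sub(r"[^a-zA-Z0-9_:]", "_", f"{prefix}.{name}")


for name in emitted:
    print(f"{name} -> {prometheus_name(name)}")
```

Under these assumptions, the tagged counters would be expected under names starting with `airflow_ti_start` and `airflow_ti_finish`.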
   
   ### What you think should happen instead?
   
   The task instance state metrics (ti.start, ti.finish) should be visible in 
Prometheus, as they are critical metrics for monitoring task execution and 
completion.
   
   I suspect one of these issues might be occurring:
   
   1. The metrics might be named or formatted differently in the OpenTelemetry 
implementation compared to how the filter is set up
   2. There could be an issue in the OpenTelemetry metrics system where these 
specific counters aren't being properly sent to the collector
   3. The metrics might be filtered at some level that isn't immediately 
obvious in the configuration
   
   ### How to reproduce
   
   1. Configure Airflow 2.9.0 with OpenTelemetry using these settings:
   
   ```
   AIRFLOW__METRICS__OTEL_ON=True
   AIRFLOW__METRICS__STATSD_ON=False
   AIRFLOW__METRICS__OTEL_HOST=otel-collector
   AIRFLOW__METRICS__OTEL_PORT=4318
   AIRFLOW__METRICS__OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4318
   AIRFLOW__METRICS__OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
   AIRFLOW__METRICS__OTEL_RESOURCE_ATTRIBUTES=service.name=airflow
   AIRFLOW__METRICS__OTEL_METRICS_EXPORTER=otlp
   AIRFLOW__METRICS__METRICS_USE_PATTERN_MATCH=True
   AIRFLOW__METRICS__METRICS_OTEL_SSL_ACTIVE=False
   AIRFLOW__METRICS__OTEL_DEBUGGING_ON=True
   AIRFLOW__METRICS__METRICS_ALLOW_LIST=ti|task
   ```
   
   2. Configure an OpenTelemetry collector with a filter to allow task metrics:
   ```
   processors:
     filter:
       metrics:
         include:
           match_type: regexp
           metric_names:
             - "task_instance.*"
             - "airflow.*task.*state.*"
             - "airflow.*task.*status.*"
   ```
   
   3. Run tasks in Airflow
   4. Check Prometheus metrics - you'll see airflow_task_instance_created_* 
metrics but no ti.start or ti.finish metrics
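
One way to test the first suspicion (a name mismatch with the filter) is to run candidate metric names against the `include` regexes from step 2. This is a rough sketch, not the collector's implementation: it assumes the filter processor matches unanchored (hence `re.search`), it uses Python's `re` instead of Go's regexp engine, and the exported names, including the `airflow.` prefix and the operator suffix, are guesses.

```python
import re

# Regexes copied from the collector filter in step 2.
INCLUDE_PATTERNS = [
    r"task_instance.*",
    r"airflow.*task.*state.*",
    r"airflow.*task.*status.*",
]


def matching_patterns(metric_name: str) -> list[str]:
    """Return the include patterns that match, using unanchored search (assumption)."""
    return [p for p in INCLUDE_PATTERNS if re.search(p, metric_name)]


# Hypothetical exported names; the "airflow." prefix and operator suffix are guesses.
candidates = (
    "airflow.task_instance_created_BashOperator",
    "airflow.ti.start",
    "airflow.ti.finish",
)
for name in candidates:
    print(name, "->", matching_patterns(name) or "no include pattern matches")
```

If the exported names really look like this, none of the three patterns would let `ti.start` or `ti.finish` through, while the creation metric passes via `task_instance.*`. Adding a pattern for `ti` names to the filter would be a quick way to confirm or rule this out.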
   
   
   ### Operating System
   
   macOS
   
   ### Versions of Apache Airflow Providers
   
   
   apache-airflow[postgres,google_auth,crypto,celery,amazon,cncf.kubernetes,statsd,pandas] + otel
   
   ### Deployment
   
   Docker-Compose
   
   ### Deployment details
   
   Using Docker Compose with Airflow 2.9.0, OpenTelemetry Collector, 
Prometheus, and Grafana. The OpenTelemetry collector is configured to receive 
metrics via OTLP HTTP and export them to Prometheus.
   
   ### Anything else?
   
   I've tried various filter configurations and have confirmed that other 
metrics from Airflow are being properly sent to Prometheus, just not the task 
instance state metrics. I've also enabled debugging on the OpenTelemetry 
collector and don't see these specific metrics in the debug output.
   I added custom debug logging to the _run_raw_task function to confirm that 
the code is being executed and the Stats.incr calls are being made, but these 
metrics still don't appear in the collector or Prometheus.
   
   ### Are you willing to submit PR?
   
   - [x] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

