andreahlert opened a new pull request, #62811:
URL: https://github.com/apache/airflow/pull/62811

   Fix flaky OTEL integration tests by disabling OTLP metric export in 
span-only tests and adding an OTEL collector healthcheck so Airflow starts only 
after the collector is ready.
   
   **Root cause:** Docker Compose sets `OTEL_METRICS_EXPORTER=otlp` and 
`AIRFLOW__METRICS__OTEL_ON=True` for the Airflow container. The test 
`setup_class` overrides the trace exporter to `console` but never disables the 
metric exporter. Scheduler and Celery worker subprocesses therefore keep trying 
to push metrics to `breeze-otel-collector:4318`. When DNS resolution fails 
transiently, the metric exporter's retries block or delay those processes, so 
spans may not be flushed to stdout before the assertions run. This causes 
failures like "Span name 'task2' wasn't found" or "Span name 'task1_sub_span3' 
wasn't found".
   
   **Changes:**
   1. **Test setup** (`test_otel.py`): When `use_otel != "true"`, set 
`OTEL_METRICS_EXPORTER=none` and `AIRFLOW__METRICS__OTEL_ON=False` in 
`setup_class` so span-only tests do not export metrics to the collector. Metric 
tests that need export already override these in 
`dag_execution_for_testing_metrics`.
   2. **OTEL collector config** (`otel-collector-config.yml`): Add 
`health_check` extension on `0.0.0.0:13133` and register it in 
`service.extensions`.
   3. **Docker Compose** (`integration-otel.yml`): Add a healthcheck for the 
`otel-collector` service (wget to `http://localhost:13133/`) and make the 
Airflow service depend on `otel-collector` with `condition: service_healthy` so 
Airflow starts only after the collector is ready.
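
The test-setup change in item 1 can be sketched roughly as below. This is a minimal illustration and not the code from `test_otel.py`: the helper names `span_only_env_overrides` and `apply_overrides` are hypothetical, and the real change sets the variables inside `setup_class`. Only the two environment variable names and their values come from the description above.

```python
import os
from typing import MutableMapping, Optional


def span_only_env_overrides(use_otel: str) -> dict[str, str]:
    """Return env overrides for a test run (hypothetical helper).

    When `use_otel` is not "true", the tests read spans from the console
    exporter, so OTLP metric export is switched off to keep subprocesses
    from retrying against the collector.
    """
    if use_otel == "true":
        # Collector-backed run: keep whatever Docker Compose configured.
        return {}
    return {
        "OTEL_METRICS_EXPORTER": "none",
        "AIRFLOW__METRICS__OTEL_ON": "False",
    }


def apply_overrides(
    overrides: dict[str, str],
    environ: Optional[MutableMapping[str, str]] = None,
) -> MutableMapping[str, str]:
    """Write the overrides into `environ` (defaults to `os.environ`)."""
    target = os.environ if environ is None else environ
    target.update(overrides)
    return target
```

Passing a plain dict as `environ` makes the behavior easy to check without touching the real process environment. Metric tests would still re-enable export themselves, as noted for `dag_execution_for_testing_metrics`.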
   
   closes: #62769
   
   ---
   
   ##### Was generative AI tooling used to co-author this PR?
   
   - [ ] No
   
   

