MichaelRBlack opened a new pull request, #63959:
URL: https://github.com/apache/airflow/pull/63959
## Backport of #44346 to v2-11-stable
This is a backport of #44346 (merged to `main` Nov 2024) to the v2-11
maintenance branch. The fix was never cherry-picked to v2.
### Problem
`OTLPMetricExporter` and `OTLPSpanExporter` in `otel_logger.py` and
`otel_tracer.py` have a hardcoded `headers={"Content-Type":
"application/json"}` parameter. These exporters serialize data as **protobuf**
and automatically set `Content-Type: application/x-protobuf`. The hardcoded
override tells the OpenTelemetry Collector to decode the payload as JSON, but
the bytes are protobuf — causing **100% export failure**:
```
Failed to export metrics batch code: 500,
reason: {"code": 13, "message": "failed to marshal error message"}
```
This means OTEL metrics and traces are **completely broken** for every
Airflow 2.x user sending to a standard OTEL Collector.
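The mismatch can be illustrated with a toy receiver that picks its decoder from the `Content-Type` header alone. This is a minimal sketch, not the Collector's actual code: `decode_body` and `PROTOBUF_LIKE_BODY` are hypothetical names, and the payload is a hand-written stand-in for protobuf bytes rather than a real OTLP message.

```python
import json

# Hypothetical stand-in for a protobuf-encoded body
# (field 1, length-delimited, value "cpu"). Not a real OTLP message.
PROTOBUF_LIKE_BODY = b"\x0a\x03cpu"


def decode_body(content_type: str, body: bytes) -> str:
    """Toy receiver: choose a decoder based only on the Content-Type header."""
    if content_type == "application/json":
        try:
            json.loads(body)
        except ValueError as exc:  # JSONDecodeError is a ValueError subclass
            return f"rejected: {type(exc).__name__}"
        return "accepted as JSON"
    if content_type == "application/x-protobuf":
        # Real protobuf decoding is omitted in this sketch.
        return "accepted as protobuf"
    return "rejected: unsupported media type"


# Same bytes, different header: only the header decides the outcome.
print(decode_body("application/x-protobuf", PROTOBUF_LIKE_BODY))
print(decode_body("application/json", PROTOBUF_LIKE_BODY))
```

The sketch does not reproduce the exact HTTP 500 response quoted above. It only shows why labelling protobuf bytes as JSON cannot succeed.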
As a secondary issue, the hardcoded `headers` parameter also prevents users
from configuring custom headers via the standard `OTEL_EXPORTER_OTLP_HEADERS`
environment variable (e.g., for authentication with hosted backends like
Grafana Cloud or Logfire).
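The precedence described above can be modelled in a few lines. This is a sketch of the behaviour as stated in this PR, not the SDK's implementation: `resolve_headers` and `parse_env_headers` are hypothetical helpers, the header values are made up, and URL-decoding of values is left out for brevity.

```python
import os

# Default the exporter applies when nothing overrides it (per the description above).
DEFAULT_HEADERS = {"Content-Type": "application/x-protobuf"}


def parse_env_headers(raw: str) -> dict:
    """Parse the 'key1=value1,key2=value2' form used by OTEL_EXPORTER_OTLP_HEADERS."""
    headers = {}
    for pair in raw.split(","):
        if "=" not in pair:
            continue  # skip empty or malformed entries
        key, value = pair.split("=", 1)
        headers[key.strip()] = value.strip()
    return headers


def resolve_headers(explicit=None, environ=None) -> dict:
    """Hypothetical model: explicit headers win, so the env var is only a fallback."""
    environ = os.environ if environ is None else environ
    chosen = explicit or parse_env_headers(environ.get("OTEL_EXPORTER_OTLP_HEADERS", ""))
    return {**DEFAULT_HEADERS, **chosen}


# Made-up credentials for illustration only.
env = {"OTEL_EXPORTER_OTLP_HEADERS": "Authorization=Bearer abc,X-Scope-OrgID=demo"}

# No hardcoded headers: env headers are used and the protobuf default survives.
print(resolve_headers(environ=env))

# Hardcoded headers: the env var is ignored and Content-Type is overridden.
print(resolve_headers({"Content-Type": "application/json"}, environ=env))
```

Under this model, dropping the hardcoded argument fixes both problems at once: the default content type is restored and user-supplied headers take effect.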
### Fix
Remove the `headers={"Content-Type": "application/json"}` parameter from
both `OTLPMetricExporter` and `OTLPSpanExporter`, allowing the SDK to use its
correct default (`application/x-protobuf`).
### Testing
Verified on a production Airflow 2.11.1 cluster sending to an OpenTelemetry
Collector → Mimir pipeline:
- **Before fix**: every 30s export batch fails with HTTP 500 `"failed to
marshal error message"`
- **After fix**: zero export errors, metrics immediately visible in Mimir
```python
# Reproducer: run from an Airflow pod.
# `endpoint` (the Collector's OTLP/HTTP metrics URL) and `metrics_data`
# (a metrics batch to export) are assumed to be defined already.
from opentelemetry.exporter.otlp.proto.http.metric_exporter import OTLPMetricExporter

good = OTLPMetricExporter(endpoint=endpoint)
bad = OTLPMetricExporter(
    endpoint=endpoint,
    headers={"Content-Type": "application/json"},
)

good.export(metrics_data)  # SUCCESS
bad.export(metrics_data)   # FAILURE: 500 "failed to marshal error message"
```
### Justification for v2 backport
This is a critical bug fix: OTEL metrics and traces cannot be exported to a
standard OTEL Collector from any Airflow 2.x release that ships these
exporters. With Airflow 2.x EOL approaching (April 2026), backporting the fix
lets the remaining v2 user base use OTEL monitoring for the rest of the
support window.
The diff is:
- `airflow/metrics/otel_logger.py`: 1 line removed
- `airflow/traces/otel_tracer.py`: 1 line changed