vatsrahul1001 opened a new pull request, #67527:
URL: https://github.com/apache/airflow/pull/67527
## Summary
Triggerer crashes on the next deadline async callback when OpenTelemetry
metrics are enabled:
```
File ".../airflow/jobs/triggerer_job_runner.py", line 659, in handle_events
Trigger.submit_event(...)
File ".../airflow/models/callback.py", line 234, in handle_event
Stats.incr(**self.get_metric_info(status, self.output))
File ".../airflow/_shared/observability/metrics/otel_logger.py", line 211,
in incr
counter.add(count, attributes=tags)
File ".../opentelemetry/sdk/metrics/.../view_instrument_match.py", line 105
aggr_key = frozenset(attributes.items())
TypeError: unhashable type: 'dict'
```
## Root cause
`Callback.get_metric_info` builds the metric tags dict directly from the
callback's `result` and `self.data` (which includes `kwargs`). Both are
frequently dicts — for deadline async callbacks the `result` is the user
callback's return value, and `kwargs` is the captured callback kwargs.
When the metrics backend is OpenTelemetry, the SDK builds the aggregation
key as `frozenset(attributes.items())`, which raises `TypeError: unhashable
type: 'dict'` if any value is unhashable (dict, list, set). Result: triggerer
crash, stalled triggers.
The bug is metrics-backend-dependent: **statsd** accepts non-primitive tag
values without complaint, so OSS users running the default statsd never see it.
**OpenTelemetry** backends hit it consistently — surfaced by Astronomer Astro
Cloud, but affects any OSS deployment that enables `[metrics] otel_*`.
Reproduces against 3.2.1 and main — not a 3.2.x regression.
## Fix
Sanitize tag values to primitives before returning from `get_metric_info`.
Keep `str | int | float | bool | None` as-is; JSON-stringify anything else
using `default=str` so values like `datetime` fall back cleanly instead of
raising.
## Test plan
- [X] Adds `test_get_metric_info_dict_values_are_stringified` — exercises
the deadline-async-callback shape (dict `result` + dict `kwargs`), asserts
every tag value is primitive, and verifies `frozenset(tags.items())` doesn't
raise.
- [X] Updates the existing `test_get_metric_info` assertion: `kwargs` is now
JSON-stringified rather than left as a dict.
- [ ] Manual repro on an OTel-enabled deployment — runs the deadline test
DAG and confirms triggerer no longer crashes.
## Reported by
Astronomer Runtime team while testing the 3.2.2rc2-based Runtime image on
Astro Cloud stage env. Pre-existing in Runtime 3.2-4 (3.2.1-based).
---
##### Was generative AI tooling used to co-author this PR?
- [X] Yes — Claude Code (Opus 4.7)
Generated-by: Claude Code (Opus 4.7) following [the
guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#gen-ai-assisted-contributions)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]