1fanwang opened a new pull request, #66789:
URL: https://github.com/apache/airflow/pull/66789
AIP-59 added OpenTelemetry traces inside the scheduler, dag processor, and
task supervisor. The CLI entry points that drive those subsystems
(`airflow tasks test`, `airflow dags trigger`, `airflow dags test`,
`airflow backfill create`) are not wrapped in spans. When one of these commands
is invoked from a wrapper that already has trace context (a CI pipeline, a
parent workflow, a debug harness), the inbound trace dies at the CLI binary and
the resulting task and DagRun spans show up as a separate trace. This PR wires
those four entry points into the existing tracer so the caller's trace and
Airflow's downstream spans stitch into a single distributed trace.
## Problem
Both `airflow tasks test ...` run from a developer's terminal and
`airflow dags trigger ...` issued by an external orchestrator create a
meaningful unit of work inside Airflow. Today there is no span to anchor that
work to the caller's trace, so the caller has to correlate trace IDs manually
via logs or accept a broken trace boundary at the CLI.
## Fix
Add a small `cli_span` context-manager helper to
`airflow_shared.observability.traces` (next to the existing AIP-59 tracer
setup). It reads `TRACEPARENT` (and optionally `TRACESTATE`) from the
environment using the W3C TraceContext propagator, opens a span parented to
that context, and yields. When the env vars are absent it produces a root span;
when OTel is not configured it falls back to the global no-op tracer and stays
inert.
Apply the helper at four CLI entry points:
- `airflow.cli.commands.task_command.task_test` → span `cli.tasks.test`
- `airflow.cli.commands.dag_command.dag_trigger` → span `cli.dags.trigger`
- `airflow.cli.commands.dag_command.dag_test` → span `cli.dags.test`
- `airflow.cli.commands.backfill_command.create_backfill` → span
`cli.backfill.create`
Each span carries `airflow.dag_id` plus the most useful context for that
command (task_id, run_id, logical_date, dry_run flag).
## Reproducer
Without this PR:
```bash
export OTEL_TRACES_EXPORTER=console
export AIRFLOW__TRACES__OTEL_ON=True
TRACEPARENT="00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01" \
airflow dags test example_bash_operator
```
The console exporter prints task spans, but they're under a fresh root trace
— the `0af76519…` trace id from `TRACEPARENT` is not used. With this PR, the
same invocation emits a `cli.dags.test` span as a child of `b7ad6b7169203331`,
and the downstream task spans share the trace id from the caller.
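
For readers unfamiliar with the header layout, the sketch below splits a
version `00` `traceparent` value into its four fields, which shows why the
reproducer's span is expected to be a child of `b7ad6b7169203331` within trace
`0af7651916cd43dd8448eb211c80319c`. It uses only the standard library;
`parse_traceparent` is a hypothetical name for illustration, and the PR itself
relies on the OpenTelemetry propagator rather than hand-rolled parsing.

```python
import re

# W3C Trace Context, version 00: version-traceid-parentid-flags, all lowercase hex.
_TRACEPARENT_RE = re.compile(r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def parse_traceparent(header):
    """Return the header's fields as a dict, or None if it is malformed."""
    match = _TRACEPARENT_RE.match(header.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # Version "ff" and all-zero trace or parent ids are invalid per the specification.
    if version == "ff" or set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None
    return {
        "version": version,
        "trace_id": trace_id,
        "parent_id": parent_id,
        # Bit 0 of the flags byte is the "sampled" flag.
        "sampled": bool(int(flags, 16) & 0x01),
    }


fields = parse_traceparent("00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01")
```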
## Tests
- `shared/observability/tests/observability/test_traces.py` — unit tests for
the helper: traceparent extraction, tracestate propagation, malformed-header
tolerance, env-var fallback, no-op-tracer safety.
- `airflow-core/tests/unit/cli/commands/test_cli_trace.py` — integration
tests for the four CLI entry points, asserting both the no-traceparent and
with-traceparent paths via an `InMemorySpanExporter`.
16 tests, all passing. The existing CLI tests for `task_test`, `dag_test`,
`dag_trigger`, and `create_backfill` still pass unchanged.
## Notes
- The helper is intentionally additive — if `otel_on` is false (the
default), the global no-op tracer kicks in and the wrapper has effectively zero
overhead.
- Only the CLI entry points that initiate work (`tasks test`,
`dags trigger`, `dags test`, `backfill create`) are wrapped. Read-only commands
(`dags list`, `tasks state`, etc.) don't need a span: they're not the kind of
operation a caller threads trace context through.
- I deliberately did not apply the helper inside the shared `action_cli`
decorator: that would emit a span for every CLI invocation, including
`airflow info`, `airflow db check`, etc., which is more noise than value. The
explicit per-entry-point wiring keeps the span set meaningful.