xBis7 commented on PR #61897: URL: https://github.com/apache/airflow/pull/61897#issuecomment-3941668232
I'm not really fond of creating synthetic spans because you have to deterministically generate the `trace_id` and the `span_id`. It's similar to what the original OTel approach was doing where it was generating all spans at the end of the dag_run. In order to create spans from under tasks, you had to regenerate the `trace_id` and the `span_id` and then make the association. It was an unconventional way to use OTel and it led down the rabbit hole of having to duplicate logic of the OTel SDK and API in the `otel_tracer`, make the calculations upfront and then inject the results, instead of letting OTel handle its own mechanics. I've cleaned up a large part of it but there are still leftovers like https://github.com/astronomer/airflow/blob/add-spans-for-task-execution/shared/observability/src/airflow_shared/observability/traces/otel_tracer.py#L374 Regarding the dag_id and the task_id in the span names, it doesn't seem that verbose or bad. Jaeger search shows only root spans when searching for all operations. If the name is generic, then we will only see something like `create_dagrun` on repeat. It's easier to pick which trace to examine if the name is distinguishable. <img width="2038" height="1494" alt="image" src="https://github.com/user-attachments/assets/387089a8-3dfd-4d92-907d-5ca9a33e5d6e" /> There were spans from a lot of internal methods that were very confusing to users. I added this flag https://github.com/astronomer/airflow/blob/add-spans-for-task-execution/airflow-core/src/airflow/config_templates/config.yml#L1413-L1419 for enabling/disabling them. We can follow the same approach for the new spans in this patch. And then we can explore sampling in a later PR. The cleanest and most intuitive approach is to only show dag_run, task and user-defined spans. Everything else should be optional, in case there is a more specific usage, e.g. a developer. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
