NBardelot opened a new issue, #44620: URL: https://github.com/apache/airflow/issues/44620
### Apache Airflow version

2.10.3

### If "Other Airflow 2 version" selected, which one?

_No response_

### What happened?

Currently, the "safe_dag_id" function is not used when Airflow sends statsd metrics. The task_id is not made safe either.

Thus, if you have a DAG with dag_id=`my.dag` and an Operator with task_id=`my.task` (both are allowed by `KEY_REGEX = re.compile(r"^[\w.-]+$")` in `utils/helper.py`, which is used for both DAG and BaseOperator), some metrics that involve both fields will look like:

```
some.prefix.ti.start.my.dag.my.task
```

The statsd mapping then cannot be done correctly: because the capture group is greedy, it maps "my.dag.my" as the dag_id and "task" as the remaining task_id.

### What you think should happen instead?

At least the dag_id should be made safe, so that dots are replaced with `__dot__`. The dag_id can then be captured in a dot-separated string that contains both a dag_id and a task_id with dots.

```
some.prefix.ti.start.my__DOT__dag.my.task
```

This can be mapped correctly using `([\w-]+)` (no dot) as the capture group for the dag_id, and `([\w.-]+)` as the capture group for the task_id.

Ideally, though, both dag_id and task_id should have their respective dots replaced with `__DOT__`.

### How to reproduce

* Create a DAG with dag_id=`my.dag` and a DummyOperator with task_id=`my.task`, and export metrics using a statsd-exporter.
* Add an extraMappings entry in the values.yaml to configure a mapping that exposes dag_id and task_id as labels.
* In the statsd-exporter, query the metric after mapping.

### Operating System

Containerized (Kubernetes)

### Versions of Apache Airflow Providers

_No response_

### Deployment

Official Apache Airflow Helm Chart

### Deployment details

_No response_

### Anything else?

_No response_

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!
### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
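### Illustration of the mapping problem

The greedy capture described under "What happened?" can be reproduced outside Airflow with plain regular expressions. The following is only a sketch: it uses Python's `re` module rather than the statsd-exporter's own matcher, the two patterns are assumed stand-ins for a real mapping, and `escape_dots` is a hypothetical helper, not an Airflow function.

```python
import re

# Assumed metric layout, taken from the example in this issue:
#   some.prefix.ti.start.<dag_id>.<task_id>
# Both groups allow dots, mirroring what KEY_REGEX permits in ids.
GREEDY = re.compile(r"^some\.prefix\.ti\.start\.([\w.-]+)\.([\w.-]+)$")

# Proposed mapping: the dag_id group must not contain a dot.
NO_DOT_DAG = re.compile(r"^some\.prefix\.ti\.start\.([\w-]+)\.([\w.-]+)$")


def escape_dots(identifier: str, token: str = "__DOT__") -> str:
    """Hypothetical helper: replace dots so the id stays one statsd segment."""
    return identifier.replace(".", token)


def metric_name(dag_id: str, task_id: str) -> str:
    """Build the example metric name from a dag_id and a task_id."""
    return f"some.prefix.ti.start.{dag_id}.{task_id}"


# Current behaviour: dots in dag_id make the split ambiguous.
raw = metric_name("my.dag", "my.task")
print(GREEDY.match(raw).groups())       # ('my.dag.my', 'task')

# Forbidding dots in the dag_id group alone does not help on the raw name.
print(NO_DOT_DAG.match(raw).groups())   # ('my', 'dag.my.task')

# With the dag_id escaped, the dot-free group recovers both ids.
safe = metric_name(escape_dots("my.dag"), "my.task")
print(NO_DOT_DAG.match(safe).groups())  # ('my__DOT__dag', 'my.task')
```

Escaping only the dag_id is enough here because the task_id is the last field and may keep its dots. Escaping both fields would make the split unambiguous regardless of field order.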
