PaulW edited a comment on issue #4636: [AIRFLOW-3737] Kubernetes executor cannot handle long dag/task names
URL: https://github.com/apache/airflow/pull/4636#issuecomment-466396697

The reason for hashing rather than slugifying is that Kubernetes cannot select pods by annotation, so the query has to rely on labels. The `clear_not_launched_queued_tasks` function in `kubernetes_executor.py` sends a query to Kubernetes:

https://github.com/apache/airflow/blob/9a2d998f57b48bcfe07f16a0563293a13141b60e/airflow/contrib/executors/kubernetes_executor.py#L563

This returns a result set, which previously would have contained the pods matching the `dag_id` and `task_id` of the task running in Airflow. However, if either value is longer than 63 characters (as is the case here), no pods are returned, because a pod carrying such a label could never have been created in the first place.

Simply truncating the values causes problems for subdags and for long dag/task names: several pods can match the same truncated name (especially during subdag execution), which causes further issues and requires more calls to Kubernetes to process the list of matching pods. Truncating and slugifying the names can still hit the same condition.

Hashing the full `dag_id` and `task_id` for use as labels, and storing the original values in full as annotations, lets the query to Kubernetes return only the one pod that belongs to the dag/task at hand.
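To make the argument concrete, here is a minimal sketch of the hash-as-label, full-value-as-annotation idea. It is not the code from this PR: the helper names (`hashed_label`, `build_pod_metadata`, `build_label_selector`) and the choice of SHA-1 are assumptions made for illustration only. The one fixed constraint is that a Kubernetes label value may not exceed 63 characters.

```python
import hashlib

MAX_LABEL_LEN = 63  # Kubernetes limit on the length of a label value


def hashed_label(value: str) -> str:
    """Hypothetical helper: map an arbitrary id to a fixed-length label value.

    A SHA-1 hex digest is always 40 lowercase hex characters, so it fits
    within the 63-character limit regardless of the input length.
    """
    return hashlib.sha1(value.encode("utf-8")).hexdigest()


def build_pod_metadata(dag_id: str, task_id: str) -> dict:
    """Hypothetical helper: labels carry hashes, annotations carry full ids."""
    return {
        "labels": {
            "dag_id": hashed_label(dag_id),
            "task_id": hashed_label(task_id),
        },
        # Annotations cannot be used in selectors, but they keep the
        # original values readable for anyone inspecting the pod.
        "annotations": {"dag_id": dag_id, "task_id": task_id},
    }


def build_label_selector(dag_id: str, task_id: str) -> str:
    """Hypothetical helper: selector string for a pod list query."""
    return "dag_id={},task_id={}".format(
        hashed_label(dag_id), hashed_label(task_id)
    )


# Two subdag ids that share a long parent prefix.
parent = "a" * 70
subdag_one = parent + ".subdag_one"
subdag_two = parent + ".subdag_two"

# Truncation collapses both ids to the same label value...
truncated_one = subdag_one[:MAX_LABEL_LEN]
truncated_two = subdag_two[:MAX_LABEL_LEN]

# ...whereas the hashed labels stay distinct.
label_one = hashed_label(subdag_one)
label_two = hashed_label(subdag_two)
```

With these inputs the truncated values are identical while the hashed labels differ, which is the collision described above. The selector string is the kind of value that could be passed as the `label_selector` argument of a pod list call; whether the executor builds it exactly this way is not shown here.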
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services
