PaulW commented on issue #4636: [AIRFLOW-3737] Kubernetes executor cannot handle long dag/task names
URL: https://github.com/apache/airflow/pull/4636#issuecomment-466396697

So the reason behind hashing vs slug is that within Kubernetes you can't query pods by annotation, so you have to rely on labels. The `clear_not_launched_queued_tasks` function in `kubernetes_executor.py` sends a query to Kubernetes:
https://github.com/apache/airflow/blob/9a2d998f57b48bcfe07f16a0563293a13141b60e/airflow/contrib/executors/kubernetes_executor.py#L563

This returns a result set, which would previously contain the pods matching the `dag_id` and `task_id` of the running task within Airflow. However, if (as is the case here) this string is longer than 63 characters, no pods will be returned, and anything that is running will be unknown to Airflow.

Simply truncating the values would cause problems with subdags or long-named dags/tasks, because several pods could match the same truncated name, which could lead to further issues. If you truncate and slugify the names, you can still hit this condition.

Hashing the entire `dag_id` and `task_id` for use as labels, while storing the original values as annotations that are read back once a hash matches and a single pod is returned, overcomes this entirely.
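As a rough illustration of the idea above (not the code in this PR), here is a minimal sketch. The helper names `label_hash`, `pod_metadata` and `label_selector` are hypothetical, and SHA-1 is only an assumption, chosen because its 40-character hex digest fits within the 63-character limit on label values; the actual change may use a different hash.

```python
import hashlib

MAX_LABEL_LEN = 63  # Kubernetes limit on the length of a label value


def label_hash(value: str) -> str:
    """Return a fixed-length hex digest usable as a label value (assumed SHA-1)."""
    # A SHA-1 hex digest is always 40 characters of [0-9a-f].
    return hashlib.sha1(value.encode("utf-8")).hexdigest()


def pod_metadata(dag_id: str, task_id: str) -> dict:
    """Hypothetical pod metadata: hashes as labels, original ids as annotations."""
    return {
        "labels": {"dag_id": label_hash(dag_id), "task_id": label_hash(task_id)},
        "annotations": {"dag_id": dag_id, "task_id": task_id},
    }


def label_selector(dag_id: str, task_id: str) -> str:
    """Equality-based label selector string matching the hashed ids."""
    return "dag_id={},task_id={}".format(label_hash(dag_id), label_hash(task_id))


# Two long subdag task ids that share their first 63 characters.
prefix = "parent_dag." + "x" * 70
task_a = prefix + ".subdag_a"
task_b = prefix + ".subdag_b"

# Truncation makes them indistinguishable as labels...
assert task_a[:MAX_LABEL_LEN] == task_b[:MAX_LABEL_LEN]
# ...while the hashes stay distinct and within the length limit.
assert label_hash(task_a) != label_hash(task_b)
assert len(label_hash(task_a)) <= MAX_LABEL_LEN

# The original id is still recoverable from the annotations of a matched pod.
meta = pod_metadata("my_dag", task_a)
assert meta["annotations"]["task_id"] == task_a
```

The selector string built by `label_selector` is what would be passed as the label selector when listing pods; the annotations are only read after a pod has matched.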
