alex-astronomer opened a new issue #18410: URL: https://github.com/apache/airflow/issues/18410
### Description

## Overview

AirflowClusterPolicyViolation currently behaves in confusing ways. I think this is an important feature to overhaul for users that have complex DAG definition requirements for tagging, owners, naming conventions, etc. It is especially useful for Airflow administrators that have many application teams using the same deployment of Airflow.

## Current behavior

When the AirflowClusterPolicyViolation exception is raised from `airflow_local_settings.py`, the DAG still shows up in the DAG list view (unlike other import errors, where the DAG sometimes no longer appears in this list). The DAG remains paused or unpaused depending on what state it was in before the DAG cluster policy was deployed. The DAG can still be scheduled and run, but all of the tasks within that DAG will fail with no errors and no logs. This silent failure is very confusing for developers that don't see the import error on their DAG.

## New Expected Behavior and Overhaul

When the AirflowClusterPolicyViolation is raised, I expect the DAG to become paused and to stay paused until it adheres to the cluster policy. There will be a tooltip on the pause button that explains why the DAG cannot be unpaused. The import error will show in the DAG view as well as the DAG list view, solved by https://github.com/apache/airflow/pull/17818. No tasks and no DAGRuns will be scheduled or run until the cluster policy is adhered to.

### Use case/motivation

I want to offer more support to users that have many application teams working on the same deployment of Airflow. Part of data pipeline quality is making policies that teams are unable to violate. If the teams adhere to the policy, they can run their DAG. Right now the behavior is very confusing and it's challenging to pause a DAG that violates a cluster policy.
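
For context, a cluster policy of the kind described here could look roughly like the sketch below. It assumes the `dag_policy` hook in `airflow_local_settings.py` and a made-up tagging convention (`REQUIRED_TAG_PREFIX` is hypothetical, as is the stand-in exception class used when Airflow is not installed); it is an illustration, not the policy from any real deployment.

```python
from types import SimpleNamespace

try:
    # Real exception class, available when Airflow is installed.
    from airflow.exceptions import AirflowClusterPolicyViolation
except ImportError:
    # Stand-in so this sketch also runs without Airflow installed.
    class AirflowClusterPolicyViolation(Exception):
        """Fallback used only when Airflow is not importable."""


# Hypothetical convention: every DAG must carry a tag naming its team.
REQUIRED_TAG_PREFIX = "team:"


def dag_policy(dag):
    """Cluster policy hook: reject DAGs that lack a team tag."""
    tags = getattr(dag, "tags", None) or []
    if not any(tag.startswith(REQUIRED_TAG_PREFIX) for tag in tags):
        raise AirflowClusterPolicyViolation(
            f"DAG {dag.dag_id} needs a tag starting with {REQUIRED_TAG_PREFIX!r}"
        )


if __name__ == "__main__":
    # SimpleNamespace stands in for a DAG object; only dag_id and tags are read.
    compliant = SimpleNamespace(dag_id="etl_daily", tags=["team:data"])
    dag_policy(compliant)  # passes silently

    violating = SimpleNamespace(dag_id="adhoc_report", tags=[])
    try:
        dag_policy(violating)
    except AirflowClusterPolicyViolation as err:
        print(f"policy violation: {err}")
```

With a policy like this deployed, the second DAG is the case this issue is about: the exception is raised at parse time, yet the DAG stays listed and can still be triggered.
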
It is possible to pause a DAG within the cluster policy function before the exception is raised, but there is a race condition if the DAG is unpaused again after that. The DAG will try to run its tasks in the window between being unpaused and the policy function pausing it again. This creates even more confusing behavior. I believe that a DAG should not be able to run unless it adheres to the cluster policy. Very open to discussion about the best way to solve this issue.

### Related issues

_No response_

### Are you willing to submit a PR?

- [X] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
