alex-astronomer opened a new issue #18410:
URL: https://github.com/apache/airflow/issues/18410


   ### Description
   
   ## Overview
   AirflowClusterPolicyViolation currently behaves in confusing ways that are 
hard to diagnose.  I think overhauling it would be an important improvement for 
users that have complex DAG definition requirements for tagging, owners, naming 
conventions, etc.  It would be especially useful for Airflow administrators 
that have many application teams using the same deployment of Airflow.
   
   ## Current behavior 
   When the AirflowClusterPolicyViolation exception is raised from 
`airflow_local_settings.py`, the DAG still shows up in the DAG list view 
(unlike other import errors, where the DAG sometimes no longer appears in this 
list).  The DAG remains paused or unpaused depending on what state it 
was in before the DAG cluster policy was deployed.  The DAG can still be 
scheduled and run, but all of the tasks within that DAG will fail with no 
errors and no logs.  This silent failure is very confusing for developers who 
don't see the import error on their DAG.
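
   For context, a minimal sketch of the kind of policy that triggers this 
behavior.  The "at least one tag" rule is only an illustrative assumption, not 
a rule taken from a real deployment, and the `ImportError` fallback is a 
stand-in added so the snippet can be run without Airflow installed.

```python
# airflow_local_settings.py (illustrative sketch)
try:
    from airflow.exceptions import AirflowClusterPolicyViolation
except ImportError:
    # Stand-in so the sketch runs without Airflow installed (assumption).
    class AirflowClusterPolicyViolation(Exception):
        pass


def dag_policy(dag):
    """Cluster policy hook: reject DAGs that carry no tags (example rule)."""
    if not getattr(dag, "tags", None):
        raise AirflowClusterPolicyViolation(
            f"DAG {dag.dag_id} has no tags; at least one tag is required."
        )
```

   With a policy like this deployed, a DAG without tags ends up in the state 
described above.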
   
   ## New Expected Behavior and Overhaul
   The behavior that I expect to see when the AirflowClusterPolicyViolation is 
raised is that the DAG will become paused, and cannot be unpaused until the DAG 
adheres to the cluster policy.  There will be a tooltip on the pause button 
that explains why the DAG cannot be unpaused.  The import error will show in 
the DAG view as well as the DAG list view (this part is solved by 
https://github.com/apache/airflow/pull/17818).  No tasks will be scheduled or 
run, and no DAGRuns will be scheduled or run, until the cluster policy is 
adhered to.
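
   The expected behavior can be modeled roughly as below.  This is a toy 
state model, not Airflow code: `DagState`, `record_violation`, 
`clear_violation`, and `request_unpause` are hypothetical names used only for 
illustration.  It also assumes that a DAG stays paused after it becomes 
compliant until a user unpauses it, which this proposal leaves open.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class DagState:
    """Hypothetical per-DAG state tracked by the scheduler/UI."""
    dag_id: str
    is_paused: bool = False
    policy_violation: Optional[str] = None  # message from the failed policy


def record_violation(state: DagState, message: str) -> None:
    """A policy violation forces the DAG into the paused state."""
    state.policy_violation = message
    state.is_paused = True


def clear_violation(state: DagState) -> None:
    """The DAG now adheres to the policy; it stays paused until unpaused."""
    state.policy_violation = None


def request_unpause(state: DagState) -> Tuple[bool, str]:
    """Refuse to unpause while a violation is recorded.

    Returns (success, reason); the reason could back the tooltip text.
    """
    if state.policy_violation is not None:
        return False, f"Cannot unpause: {state.policy_violation}"
    state.is_paused = False
    return True, ""
```

   Because the unpause request itself is rejected, there is no window in 
which a violating DAG is unpaused and can start tasks.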
   
   ### Use case/motivation
   
   I want to offer more support to users that have many application teams 
working on the same deployment of Airflow.  Part of data pipeline quality is 
making policies that teams are unable to violate.  If the teams adhere to the 
policy, they can run their DAG.
   
   Right now the behavior is very confusing and it's challenging to pause a DAG 
that violates a cluster policy.  It is possible to pause a DAG within the 
cluster policy function before the exception is raised, but there is a race 
condition if the DAG is unpaused again after that.  The DAG will try to run its 
tasks in the window between the moment it is unpaused and the moment the DAG 
policy function pauses it again.  This creates even more confusing behavior.
   
   I believe that a DAG should not be able to be run unless it adheres to the 
cluster policy.
   
   Very open to discussion about the best way to solve this issue.
   
   ### Related issues
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

