zach-overflow commented on issue #43176:
URL: https://github.com/apache/airflow/issues/43176#issuecomment-2706207303

   @Dev-iL Thank you for pointing me towards the DagBag unit testing page, I 
wasn't aware that was an available approach. You're correct that the DagBag 
unit testing would cover unique DAG ID enforcement. 
   
   As for checking against user-defined cluster policies, I _think_ the DagBag 
unit testing could cover that, but the [cluster policy 
docs](https://airflow.apache.org/docs/apache-airflow/stable/administration-and-deployment/cluster-policies.html#how-do-define-a-policy-function)
 are not clear on whether policies are evaluated during DagBag loading or 
during the actual DAG/task scheduling.  
   
   TL;DR: I think the most broadly useful static check would be one that flags 
problematic top-level code (e.g. network calls, `Variable.get()` calls, 
expensive top-level imports, etc.).
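
   To make that concrete, here is a rough sketch of what such a check could 
look like using only the standard-library `ast` module. The 
`FLAGGED_CALLS` set and the `find_top_level_calls` helper are names I made up 
for illustration, not anything Airflow provides, and the sketch only covers 
flagged calls, not expensive imports.

```python
import ast

# Hypothetical list of dotted call names to flag; adjust for your codebase.
FLAGGED_CALLS = {"Variable.get", "requests.get", "requests.post"}


def _dotted_name(node):
    """Return 'a.b.c' for a Name/Attribute chain, or None for anything else."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


class _TopLevelCallFinder(ast.NodeVisitor):
    def __init__(self):
        self.hits = []

    # Function and lambda bodies run at call time, not at import time, so
    # they are skipped. Limitation: decorators and default arguments are
    # skipped too, even though they are evaluated at import time.
    def visit_FunctionDef(self, node):
        pass

    visit_AsyncFunctionDef = visit_FunctionDef
    visit_Lambda = visit_FunctionDef

    def visit_Call(self, node):
        name = _dotted_name(node.func)
        if name in FLAGGED_CALLS:
            self.hits.append((node.lineno, name))
        self.generic_visit(node)  # also inspect nested calls in the arguments


def find_top_level_calls(source):
    """Return (line, call_name) pairs for flagged calls that run at import time."""
    finder = _TopLevelCallFinder()
    finder.visit(ast.parse(source))
    return finder.hits
```

   Because this only parses the file and never imports it, it can run as a 
pre-commit hook or CI step without an Airflow installation.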
   
   > What common pattern might make a DAG definition non-serializable?
   * After my previous post, I learned that Airflow enforces DAG serialization, 
so I suppose this is less of a concern unless users are relying on some 
extended/custom serialization functionality.
   
   > In what contexts does this matter?
   * Likely not many, apart from some non-standard serialization, so I think 
this is less of a concern for me now.
   
   > Is this supposed to be a static or a DagBag-related test? If static - how 
can this be done without trying to serialize the DAG and seeing if it works?
   * I guess the best way would be to simply run static unit tests against any 
custom serialization methods.
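
   As a rough illustration of that kind of test, assuming a made-up 
`RetryConfig` type with hand-written `serialize`/`deserialize` helpers (none 
of these are Airflow APIs), a round-trip check might look like:

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class RetryConfig:
    """Hypothetical custom object that a DAG author wants to serialize."""
    retries: int
    delay_seconds: float


def serialize(cfg):
    # Hypothetical custom serializer: plain JSON with stable key order.
    return json.dumps(asdict(cfg), sort_keys=True)


def deserialize(payload):
    # Inverse of serialize(); rebuilds the dataclass from the JSON payload.
    return RetryConfig(**json.loads(payload))


def test_round_trip():
    original = RetryConfig(retries=3, delay_seconds=1.5)
    # The value must survive serialize -> deserialize unchanged.
    assert deserialize(serialize(original)) == original
```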
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
