ferruzzi opened a new issue, #68374:
URL: https://github.com/apache/airflow/issues/68374

   While adding a test to test_scheduler_job.py I noticed we have a test 
isolation issue that should eventually get cleaned up, but this is going to be 
a big project. 
   
   Let's look at 
[test_find_executable_task_instances_executor_with_teams](https://github.com/apache/airflow/blob/main/airflow-core/tests/unit/jobs/test_scheduler_job.py#L1563)
 for example.  It starts with:
   
   ```python
           clear_db_teams()
           clear_db_dag_bundles()
   ```
   
   This is because _some other_ tests are leaving artifacts behind.  The 
pre-cleaning is defensive coding as a stopgap, and we should fix the root 
cause instead.  A quick grep suggests that over 40 tests start with this pattern. 
   
   ## The Issue:
   
   It looks like `dag_maker.cleanup()` deletes `DagModel`, `DagRun`, 
`TaskInstance`, `DagVersion`, `XCom`, `TaskMap`, and `AssetEvent`, but does 
*not* delete the `DagBundleModel(s)` it creates.  Similarly, the `testing_team` 
fixture creates a `Team` row and never tears it down.
   
   It looks like `testing_dag_bundle` was created for tests that bypass 
`dag_maker` and use `DagBag` directly (e.g. `TestSchedulerJobQueriesCount`); 
those tests also leave their bundles behind.
   
   ## Proposed Fix:
   
     1. Add bundle tracking to `dag_maker`; maybe a `set()` that stores the 
bundle names it creates.
     2. Delete bundles in `dag_maker.cleanup()` after `DagModel` deletion 
(FK-safe order) by iterating over that set().
     3. Add teardown to `testing_team` by changing it to a `yield` fixture 
that deletes the `Team` row once the test finishes.
     4. Migrate tests that bypass `dag_maker` (e.g. 
`TestSchedulerJobQueriesCount`, which uses `DagBag` directly) to use `dag_maker` 
for setup.
     5. Remove `testing_dag_bundle` once all tests go through `dag_maker`.
     6. Remove all of the `clear_db_teams()` / `clear_db_dag_bundles()` 
pre-cleaning from tests.

