On Wed, 27 Apr 2022 at 20:28, Jarek Potiuk <[email protected]> wrote: > That looks like a nice improvement.
I did some hacking and realized that for the time being, there's no good way to implement this as originally proposed. However, we can do something about the situation nonetheless. What we can do is to detect and report when a DAG is dropped during loading. That is, if a DAG instance is created, but not used (either exposed at the top-level or used indirectly, for example in a SubDagOperator), we can log a message as well as report it as an import error (i.e. `DagBag.import_errors`). I think it is fair to call this an import error because if you didn't want to expose the DAG, then why create it? That said, there are two possible ways to implement this: 1. The easiest is simply to record the DAGs that have been created during loading (or rather, the DAG ids) and subtract them from the top-level ones during processing to see if there are some that are not accounted for. We need to add a special provision in SubDagOperator to avoid having it reported as an error here. 2. The more elegant way is to untangle the reference cycle between DAG and BaseOperator such that dropped DAGs are actually dropped (finalized). I have experimented with both and the only issue besides additional weakref-based code and logic in (2) is that we have quite a few tests which use the following pattern: with DAG(...): op1 = ... op2 = ... The DAG instance here isn't assigned to the local scope, but each of the tasks directly reference it. The changes required in (2) mean that tasks (and task groups) would now weakly reference the DAG and so actually, the DAG would be finalized immediately after leaving the with construct. The fix is simply to add "as dag" in these tests (it's about a dozen of them all in all). I think reference cycles are objectively bad, but there are some non-trivial changes to the code in order to make it work. For example, there is some deep-copying action happening where we need to handle the weakrefs carefully and the pickling logic also requires some updating. Option (1) is much simpler, but less elegant. Cheers
