On Wed, 27 Apr 2022 at 20:28, Jarek Potiuk <[email protected]> wrote:
> That looks like a nice improvement.

I did some hacking and realized that for the time being, there's no
good way to implement this as originally proposed.

However, we can do something about the situation nonetheless.

What we can do is to detect and report when a DAG is dropped during
loading. That is, if a DAG instance is created, but not used (either
exposed at the top-level or used indirectly, for example in a
SubDagOperator), we can log a message as well as report it as an
import error (i.e. `DagBag.import_errors`).

I think it is fair to call this an import error because if you didn't
want to expose the DAG, then why create it?

That said, there are two possible ways to implement this:

1. The easiest is simply to record the DAGs that have been created
during loading (or rather, the DAG ids) and subtract them from the
top-level ones during processing to see if there are some that are not
accounted for. We need to add a special provision in SubDagOperator to
avoid having it reported as an error here.

2. The more elegant way is to untangle the reference cycle between DAG
and BaseOperator such that dropped DAGs are actually dropped
(finalized).

I have experimented with both and the only issue besides additional
weakref-based code and logic in (2) is that we have quite a few tests
which use the following pattern:

with DAG(...):
   op1 = ...
   op2 = ...

The DAG instance here isn't assigned to the local scope, but each of
the tasks directly reference it. The changes required in (2) mean that
tasks (and task groups) would now weakly reference the DAG and so
actually, the DAG would be finalized immediately after leaving the
with construct.

The fix is simply to add "as dag" in these tests (it's about a dozen
of them all in all).

I think reference cycles are objectively bad, but there are some
non-trivial changes to the code in order to make it work. For example,
there is some deep-copying action happening where we need to handle
the weakrefs carefully and the pickling logic also requires some
updating.

Option (1) is much simpler, but less elegant.

Cheers

Reply via email to