I think the source of your confusion is that the UI is based on the DAG
definition file as it was when the webserver loaded it. I believe that is
defensible on the one hand, since the scheduler operates in a somewhat
similar way, but rightfully confusing on the other (and it doesn't TOTALLY
mimic scheduler behavior, so it's really the worst of both worlds IMHO).
When you change the DAG definition file, you have to restart the webserver
for it to pick up the new graph/tree drawings. In the case of removing
tasks, I think that if you queried the underlying metadata database
directly, your 'table2' task instances would still exist, but the UI
doesn't know that it should show them, because it only draws what is in the
DAG definition files it had on hand when the webserver process last
reloaded.
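
If you want to check this against the metadata database yourself, something
like the following could work. It is only a sketch: it assumes the table is
named task_instance and has dag_id, task_id, execution_date and state
columns, and it uses an in-memory SQLite database with made-up rows as a
stand-in for the real one. 'my_dag', 'table1' and the dates are
placeholders, and task_instances_for is just a helper name I picked.

```python
import sqlite3


def task_instances_for(conn, dag_id, task_id):
    """Return (execution_date, state) rows recorded for one task of one DAG."""
    cur = conn.execute(
        "SELECT execution_date, state FROM task_instance "
        "WHERE dag_id = ? AND task_id = ? ORDER BY execution_date",
        (dag_id, task_id),
    )
    return cur.fetchall()


def demo_connection():
    """Build an in-memory stand-in for the metadata database (made-up rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE task_instance "
        "(dag_id TEXT, task_id TEXT, execution_date TEXT, state TEXT)"
    )
    conn.executemany(
        "INSERT INTO task_instance VALUES (?, ?, ?, ?)",
        [
            ("my_dag", "table1", "2016-10-14", "success"),
            ("my_dag", "table2", "2016-10-14", "success"),
            ("my_dag", "table2", "2016-10-15", "success"),
        ],
    )
    return conn


if __name__ == "__main__":
    # Rows for the removed task are still there even if the UI no longer draws it.
    for row in task_instances_for(demo_connection(), "my_dag", "table2"):
        print(row)
```

Against a real metadata database you would pass your own DB-API connection
instead of demo_connection(), and the '?' placeholders may need to change
to whatever parameter style your driver expects.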

For my non-dynamic DAGs, when I change the DAG shape dramatically
(including removing tasks), I usually create an entirely new DAG. In
practice this means changing the dag_id of the DAG object in the DAG
definition file (for example, 'my_dag' becomes 'my_dag_v2') so that the new
DAG cannot be confused with the previous history. If you choose to keep
your previous DAG definition file ('my_dag') but leave that DAG switched
off, and add the new DAG ('my_dag_v2') switched on, the UI will render them
as two different DAGs and you can navigate through the history in the UI as
normal.
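
If you bump versions this way often, the naming convention is easy to
script. A minimal sketch: next_dag_id is a hypothetical helper of mine, not
anything that ships with Airflow, and its result is simply the string you
would use as the dag_id when constructing the DAG object in the definition
file.

```python
import re


def next_dag_id(dag_id):
    """Return the dag_id for the next version, e.g. 'my_dag' -> 'my_dag_v2'."""
    match = re.match(r"^(?P<base>.+)_v(?P<num>\d+)$", dag_id)
    if match is None:
        # A dag_id without a _vN suffix is treated as version 1.
        return dag_id + "_v2"
    return "{}_v{}".format(match.group("base"), int(match.group("num")) + 1)


if __name__ == "__main__":
    print(next_dag_id("my_dag"))
    print(next_dag_id("my_dag_v2"))
```

The old definition file keeps its original dag_id, so its history stays
attached to the old name.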

This has been discussed as the preferred workaround for several different
types of major DAG configuration changes (such as moving start_date further
into the past than the original version was configured for), but I'm not
sure whether any work has started (yet?) on redesigning it. As mentioned, I
believe it is basically a side effect of drawing graphs and trees from the
DAG definition files as opposed to from the run history.

Sort of an aside, but relevant if you change DAG shape with any frequency:
when we add tasks to an existing DAG, the tree view/graph view fills in the
added task for all of history with the state 'no status'. If that DAG is
set to have depends_on_past=True, this will actually clog up a new DagRun,
because the new task has no previous task instance history, unless I do
something to force it to execute in the new DagRun regardless.
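
To make the depends_on_past problem concrete, here is a toy model of it. It
is my own simplification of the behavior I described, not Airflow's actual
dependency-checking code; can_run, the history dict, the task names and the
ignore_depends_on_past argument are all made up for illustration, with the
last one standing in for "something to force the new task".

```python
def can_run(task_id, run_index, history, depends_on_past,
            ignore_depends_on_past=False):
    """Decide whether task_id may run in DagRun number run_index.

    history maps (task_id, run_index) -> state. A missing key stands for
    what the UI shows as 'no status'.
    """
    if not depends_on_past or ignore_depends_on_past:
        return True
    if run_index == 0:
        # Assumption: the very first run has no previous run to wait on.
        return True
    # Otherwise the same task must have succeeded in the previous DagRun.
    return history.get((task_id, run_index - 1)) == "success"


# 'old_task' ran in DagRuns 0 and 1; 'new_task' was added just before DagRun 2.
history = {("old_task", 0): "success", ("old_task", 1): "success"}

if __name__ == "__main__":
    print(can_run("old_task", 2, history, depends_on_past=True))
    print(can_run("new_task", 2, history, depends_on_past=True))
```

In this model 'new_task' stays blocked in DagRun 2 until it is forced once
(ignore_depends_on_past=True here); after that run succeeds, later runs
have a previous success to depend on.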

On Sun, Oct 16, 2016 at 7:04 PM, Boris Tyukin <bo...@boristyukin.com> wrote:

> I opened a JIRA - looks like based on comments in other threads, it does
> not work properly right now.
> [AIRFLOW-574] Show Graph/Tree view and Task Instance logs using executed
> DagRun, not current
