Tony Brookes created AIRFLOW-4540:
-------------------------------------

             Summary: Allow historic DAG runs to be rendered in the UI based on 
what the database says they did, not the current DAG structure.
                 Key: AIRFLOW-4540
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-4540
             Project: Apache Airflow
          Issue Type: Improvement
          Components: DAG, ui
            Reporter: Tony Brookes


Dags evolve over time.  Their structure changes.  Indeed, because they are 
dynamically created in code, they can change even if the code remains the same, 
based on external factors which the code itself reacts to (generating different 
tasks).  This is all part of the wonderful advantages of the Airflow approach 
to generating Dags.

However, when you look at prior runs of the Dags in the UI, the rendered graph 
is always based on evaluating the Dag code "right now."  So if the Dags have 
changed, or external factors have changed, then the graph can look nothing like 
it did when it was actually run.

For example, we have evolved our Dags based on experience, changing names to be 
more meaningful, adding and removing operators, and moving from hard-coded 
generation to templates and dynamic generation of parallel tasks via code.  
When you open older Dag runs in the UI, you see all sorts of strangeness: tasks 
which no longer exist simply vanish, and tasks you have since added show up 
(with their predecessor and successor links if they're there).  This can make 
it look like a downstream task was triggered even though its upstream parent 
never ran.  Quite confusing, especially when trying to debug problems.

I would love the ability to see what a complete Dag _*actually did*_.  
That is, generate the graph from the entries in the database, completely 
irrespective of what the current Dag code says it should look like.  To fully 
support this might require some additional columns, such as the name of the 
operator class and perhaps the predecessor and successor task IDs.
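
A minimal sketch of the idea, assuming each task's row for a run also stored 
its operator class name and upstream task IDs.  The `RecordedTask` fields and 
the `build_as_was_graph` helper are hypothetical names used for illustration 
only; they are not Airflow's actual schema or API.

```python
from collections import namedtuple

# Hypothetical shape of one stored row per task for a single Dag run.
# These field names are assumptions, not Airflow's real table columns.
RecordedTask = namedtuple(
    "RecordedTask", ["task_id", "operator", "state", "upstream_task_ids"]
)


def build_as_was_graph(rows):
    """Build (nodes, edges) purely from stored rows, ignoring current Dag code."""
    nodes = {row.task_id: row for row in rows}
    edges = []
    for row in rows:
        for upstream_id in row.upstream_task_ids:
            # Only draw an edge if the upstream task was recorded for this run.
            if upstream_id in nodes:
                edges.append((upstream_id, row.task_id))
    return nodes, sorted(edges)


# Example rows as they might have been recorded when the run happened.
rows = [
    RecordedTask("extract", "BashOperator", "success", []),
    RecordedTask("transform", "PythonOperator", "success", ["extract"]),
    RecordedTask("load", "PythonOperator", "failed", ["transform"]),
]
nodes, edges = build_as_was_graph(rows)
```

Because the graph is derived only from what was recorded, a task that has 
since been renamed or removed in the Dag code would still appear in the 
historic view with the state it actually reached.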

But from a production support standpoint, it would be incredibly valuable to 
see the "as was" view of historic Dag execution rather than the current "as is" 
view.

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
