One of the selling points for me in using Airflow for our new project is the
ability to create tasks programmatically on every run. People have mentioned
in various talks that they generate tasks on every run by pulling
a list of files or reading some external config (YAML), etc.
I also found this example https://gist.github.com/mtustin-handy/
I created a quick DAG similar to the one above and I am observing some weird
issues (using Airflow v18.104.22.168 and Sequential Executor).
My DAG has a list like:
tables_list = ['table1','table2']
Then I create a first task (dummy), generate a bash operator for every
table in the list, and set the dummy task as the upstream task of each one.
It works great on the first run - all tasks are created properly.
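
To make the setup concrete, here is a rough sketch of that layout. It is plain Python with no Airflow import so that it runs on its own, and it is not my actual DAG file: the `plan_tasks` helper and the `start` name for the dummy task are made up for illustration. The `t_` prefix and the DAG id come from the error message quoted further down. In the real file each entry would be a bash operator with the dummy task set as its upstream.

```python
# Plain-Python sketch of the DAG layout (no Airflow import, runs standalone).
# Assumption: in the real DAG file every "t_<table>" entry is a bash operator
# and the "start" entry is the dummy task.

DAG_ID = "dynamic_job_proto_v1"  # DAG id as it appears in the error message
START_TASK_ID = "start"          # hypothetical name for the dummy task

tables_list = ['table1', 'table2']


def plan_tasks(tables):
    """Map each task_id to the list of its upstream task_ids."""
    plan = {START_TASK_ID: []}
    for table in tables:
        # One generated task per table, all downstream of the dummy task
        plan["t_" + table] = [START_TASK_ID]
    return plan


plan = plan_tasks(tables_list)
for task_id in sorted(plan):
    print(DAG_ID + "." + task_id, "<-", plan[task_id])
```

Because the plan is rebuilt from `tables_list` every time the file is parsed, editing the list changes the set of task ids the DAG defines.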
Then I change the list to add a new table3:
tables_list = ['table1','table2','table3']
The DAG runs again but I do not see table3 in the Graph or Tree view. I do see
the table3 task under the Task Instance view, so it was generated. But if I
click on it, I get an error like: Task [dynamic_job_proto_v1.t_table3] doesn't
seem to exist at the moment
Then I restarted the scheduler - same thing. New DAG runs still did not show
the task. Then I restarted the Airflow webserver - this time I was able to see
the table3 task in the views.
After that I removed table2 from my list and the DAG ran again - same issue.
Table2 was still in the views until I restarted the webserver. After the
restart, table2 disappeared from the previously run DAGs, which is bad because
now I cannot go back in history, cannot compare execution times, etc.
Is this a known bug?