My current thinking is to add an optional field to the dag table whose value
is provided by the DAG. We currently intercept the load path, so we could use
this field to make sure we load the same generation. My concern is the
interaction with the scheduler; I am not familiar enough with that logic to
predict the corner cases where this would fail.
Any other recommendations for how this could be done?
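
As a rough sketch of the check described above (plain Python, not actual
Airflow code): DagRecord, GenerationMismatch, and check_generation are
hypothetical names, and DagRecord only stands in for a row of the dag table
with the proposed optional field. The real hook would live wherever the load
path is intercepted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DagRecord:
    """Hypothetical stand-in for a row in the dag table."""
    dag_id: str
    # Proposed optional field; None means the DAG did not declare a generation.
    generation: Optional[str] = None


class GenerationMismatch(Exception):
    """Raised when the loaded code is not the generation the scheduler recorded."""


def check_generation(record: DagRecord, loaded_generation: Optional[str]) -> None:
    """Compare the generation recorded by the scheduler with the one just loaded.

    Assumption: the worker can read `loaded_generation` from the DAG object it
    just parsed, and `record` from the metadata database.
    """
    if record.generation is None:
        # The field is optional, so DAGs that do not set it are never blocked.
        return
    if loaded_generation != record.generation:
        raise GenerationMismatch(
            "dag %s: expected generation %r, loaded %r"
            % (record.dag_id, record.generation, loaded_generation)
        )
```

A worker would call check_generation right after loading the DAG and refuse
to run the task (or retry the load) on a mismatch. What the scheduler should
do when it sees a new generation mid-run is the part this does not answer.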
On Mon, Feb 19, 2018, 10:33 PM David Capwell <dcapw...@gmail.com> wrote:
> We have been using airflow for logic that delegates to other systems, so we
> inject a task that all tasks depend on to make sure the resources used are
> the same for all tasks in the dag. This works well for tasks that delegate
> to external systems, but people are starting to need to run logic in
> airflow, and the fact that the scheduler and all workers can see different
> states is causing issues.
> We can make sure that all the code is deployed in a consistent way but
> need help from the scheduler to tell the workers the current generation for
> a DAG.
> My question is, what would be the best way to modify airflow to allow DAGs
> to define a generation value that the scheduler could send to workers?