Hi All,

Now that we have an API in place. I would like to propose a new state for tasks 
named “WAITING_ON_CALLBACK”. Currently, we have tasks that have a kind of 
polling mechanism (ie. Sensors) that wait for an action to happen and check if 
that action happened by regularly polling a particular backend. This will 
always use a slot from one of the workers and could starve an airflow cluster 
for resources. What if a callback to Airflow could happen that task to change 
its status by calling a callback mechanism without taking up a worker slot. A 
timeout could (should) be associated with the required callback so that the 
task can fail if required. So a bit more visual:


Task X from DAG Z  does some work and sets “WAITING_ON_CALLBACK” -> API post to 
/dags/Z/dag_runs/20170101T00:00:00/tasks/X with payload “set status to SUCCESS”

DAG Z happily continues.

Or

Task X from DAG Z sets “WAITING_ON_CALLBACK” with timeout of 300s -> time 
passes -> scheduler sets task to FAILED.


Any thoughts?

- Bolke

Reply via email to