SteNicholas opened a new issue #9321:
URL: https://github.com/apache/airflow/issues/9321


   **Description**
   
   The idea of signal-based scheduling is to let operators send signals to 
the scheduler to trigger a scheduling action, such as starting, stopping, or 
restarting jobs. Compared with the current mechanism for retrieving state 
changes, signal-based scheduling also lets the scheduler learn of a change in 
dependency state immediately, without periodically querying the metadata 
database. In addition, signal-based scheduling opens the door to richer 
scheduling semantics, such as periodic execution and manual triggering at 
per-operator granularity.
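   
   As a rough illustration only, the sketch below shows one way the idea could 
look in plain Python. It is not Airflow code: `Action`, `Signal`, 
`SignalScheduler`, `send_signal`, `handle_pending`, and the job ids are all 
hypothetical names, and an in-process queue stands in for whatever transport a 
real implementation would use.

```python
import queue
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    # The three scheduling actions named in the description.
    START = "start"
    STOP = "stop"
    RESTART = "restart"


@dataclass(frozen=True)
class Signal:
    job_id: str  # hypothetical job identifier
    action: Action


class SignalScheduler:
    """Toy scheduler that reacts to signals instead of polling a database."""

    def __init__(self):
        self._signals = queue.Queue()
        self.running = set()  # ids of jobs currently considered running
        self.log = []         # (job_id, action) pairs handled so far

    def send_signal(self, signal):
        # Called by an operator; the signal is queued for the scheduler at once.
        self._signals.put(signal)

    def handle_pending(self):
        # Drain every queued signal and apply the requested action in order.
        while True:
            try:
                signal = self._signals.get_nowait()
            except queue.Empty:
                return
            if signal.action is Action.START:
                self.running.add(signal.job_id)
            elif signal.action is Action.STOP:
                self.running.discard(signal.job_id)
            elif signal.action is Action.RESTART:
                # Modelled as stop followed by start.
                self.running.discard(signal.job_id)
                self.running.add(signal.job_id)
            self.log.append((signal.job_id, signal.action.value))


scheduler = SignalScheduler()
scheduler.send_signal(Signal("streaming_job", Action.START))
scheduler.send_signal(Signal("batch_job", Action.START))
scheduler.send_signal(Signal("batch_job", Action.STOP))
scheduler.handle_pending()
print(sorted(scheduler.running))  # ['streaming_job']
```

   In this sketch the scheduler drains the queue on demand. A long-running 
scheduler would more likely block on the queue in a loop, so that each signal 
is handled as soon as it arrives rather than at the next polling interval.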
   
   **Use case / motivation**
   
   The Airflow scheduler uses DAG definitions to monitor the state of tasks in 
the metadata database, and triggers the task instances whose dependencies have 
been met. In other words, scheduling is driven by the state of dependencies. 
   However, the current design has the following caveats:
   - When the workflow contains streaming jobs, the scheduler cannot work, 
because a streaming job runs forever and never reaches a finished state.
   - Communication between an operator and the scheduler incurs latency of up 
to one database query interval.
   To address these issues, we propose adding signal-based scheduling to the 
scheduler.
   
   **Related Issues**
   
   N/A



