chinwobble commented on issue #18999:
URL: https://github.com/apache/airflow/issues/18999#issuecomment-958449899
We are using LocalExecutor and it seems like deferral consumes more worker slots than necessary.
From implementing deferral, this is what appears to happen (a minimal code sketch follows the list):
1. SchedulerJob decides a task is ready to be queued.
2. The queued task runs on the LocalExecutor, which starts a new task
process, consuming 1 worker slot.
3. The execute method starts; it takes about 1 second to submit the job to
Databricks.
4. self.defer() is called and queues a trigger.
5. This raises an exception (TaskDeferred), which terminates the task
process and returns the worker slot to the pool.
6. The triggerer is a separate process that can process triggers
asynchronously and determine when to resume a task.
7. The trigger yields a result, causing the scheduler to re-queue the
original Databricks task, possibly resuming at a different method of the
operator.
8. A new executor worker process is started (the second process start,
consuming 1 worker slot); when it completes, it marks the task as succeeded
or failed.
9. The task is complete and the worker slot is returned to the pool.
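
To make the slot accounting concrete, here is a minimal sketch of the deferral pattern described above. The operator, trigger, and helper names (`ExampleSubmitRunOperator`, `ExampleRunTrigger`, `_submit_run`, the polling logic) are all illustrative assumptions, not the actual Databricks provider code:

```python
from __future__ import annotations

import asyncio
from typing import Any

from airflow.models.baseoperator import BaseOperator
from airflow.triggers.base import BaseTrigger, TriggerEvent


class ExampleRunTrigger(BaseTrigger):
    """Runs inside the triggerer process (steps 6-7); holds no worker slot."""

    def __init__(self, run_id: str) -> None:
        super().__init__()
        self.run_id = run_id

    def serialize(self) -> tuple[str, dict[str, Any]]:
        # The triggerer re-instantiates the trigger from this (classpath, kwargs) pair.
        return ("example_module.ExampleRunTrigger", {"run_id": self.run_id})

    async def run(self):
        # Asynchronous polling loop; many of these can share one triggerer process.
        while not await self._job_finished():
            await asyncio.sleep(30)
        # Yielding an event tells the scheduler to re-queue the task (step 7).
        yield TriggerEvent({"run_id": self.run_id, "state": "done"})

    async def _job_finished(self) -> bool:
        return True  # hypothetical placeholder: poll the external API here


class ExampleSubmitRunOperator(BaseOperator):
    def execute(self, context):
        run_id = self._submit_run()  # step 3: the ~1 s submit call
        # Steps 4-5: self.defer() raises TaskDeferred, ending this process
        # and freeing the worker slot until the trigger fires.
        self.defer(
            trigger=ExampleRunTrigger(run_id=run_id),
            method_name="execute_complete",
        )

    def execute_complete(self, context, event):
        # Steps 8-9: a second worker slot is consumed to run this resume
        # method, which marks the task as succeeded or failed.
        if event["state"] != "done":
            raise RuntimeError(f"run {event['run_id']} did not succeed: {event}")

    def _submit_run(self) -> str:
        return "run-123"  # hypothetical placeholder for the real submit call
```

This is where the two worker-slot consumptions in the list come from: one for the initial `execute()` up to `self.defer()`, and one for `execute_complete()` after the trigger fires; the wait in between costs only triggerer capacity.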