Jorricks opened a new pull request #17207:
URL: https://github.com/apache/airflow/pull/17207


   **Summary of the problem:**
   Currently the Celery `task_id` is stored only after a scheduler has 
launched a task in its executor.
   The Celery `task_id` is then put on the `event_buffer`, and the scheduler 
periodically reads the `event_buffer` and stores the value as the 
`external_executor_id`.
   Manually triggered tasks enter the adoption flow, because their executor 
instances exist only to launch that one specific task. Since their 
`external_executor_id` is not set, they are not adopted and get killed 
instead. This means that any manually triggered task that does not already 
have an `external_executor_id` from a previous scheduled run may be killed 
if the adoption routine kicks in while the task is still running.
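
   To make the failure mode concrete, here is a minimal toy model of the 
adoption decision described above. It is not Airflow's code: 
`FakeTaskInstance` and `try_adopt` are hypothetical names used only for 
illustration, and the real adoption logic is more involved.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class FakeTaskInstance:
    """Hypothetical stand-in for a running task instance."""

    task_id: str
    external_executor_id: Optional[str] = None


def try_adopt(orphans: List[FakeTaskInstance]) -> Tuple[List[str], List[str]]:
    """Split orphaned tasks into (adopted, killed).

    Assumption: a task can only be adopted when its Celery id is known.
    """
    adopted: List[str] = []
    killed: List[str] = []
    for ti in orphans:
        if ti.external_executor_id is not None:
            adopted.append(ti.task_id)
        else:
            killed.append(ti.task_id)
    return adopted, killed


# A scheduled task whose Celery id was already read from the event_buffer.
scheduled = FakeTaskInstance("scheduled_task", external_executor_id="celery-123")
# A manually triggered task whose Celery id was never stored.
manual = FakeTaskInstance("manual_task")

adopted, killed = try_adopt([scheduled, manual])
```

   In this model the manually triggered task ends up in `killed` even though 
it is still running, which is the behaviour this PR tries to avoid.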
   
   **Solution:**
   I can imagine two solutions:
   1. After every manually triggered task, we read the `event_buffer` and 
store the ids it contains in the task instances.
   2. Every task triggered for a Celery executor automatically stores its 
`external_executor_id` when the task starts up.
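
   A rough sketch of the idea behind option 2, under stated assumptions: 
`FAKE_DB`, `FakeTaskInstance` and `run_task` are hypothetical names, and the 
dictionary stands in for the metadata database. In a real Celery worker the 
id would typically come from a bound task's `self.request.id`; here it is 
passed in explicitly so the example stays self-contained.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class FakeTaskInstance:
    """Hypothetical stand-in for a task instance row."""

    task_id: str
    external_executor_id: Optional[str] = None


# Hypothetical in-memory stand-in for the metadata database.
FAKE_DB: Dict[str, FakeTaskInstance] = {}


def run_task(task_id: str, celery_task_id: str) -> None:
    """Record the Celery id first, then run the actual work."""
    ti = FAKE_DB[task_id]
    # Stored at start-up, before any work, so the id no longer depends on
    # a scheduler reading the event_buffer.
    ti.external_executor_id = celery_task_id
    # ... the real task body would execute here ...


FAKE_DB["manual_task"] = FakeTaskInstance("manual_task")
run_task("manual_task", celery_task_id="celery-456")
```

   With the id written by the task itself, a manually triggered task carries 
an `external_executor_id` as soon as it starts running.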
   
   I implemented both but found the second version nicer.
   I am looking for some feedback so please provide me with any you can think 
of :)
   
   **Related issues:**
   related: https://github.com/apache/airflow/issues/16023
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


