Jorricks edited a comment on issue #16023:
URL: https://github.com/apache/airflow/issues/16023#issuecomment-876729124


   I did some checks. I think it might be related to the fact that tasks 
started through the webserver never get an `external_executor_id` value. In a 
Scheduler run, this value is set after the task is queued: the scheduler 
reads the event_buffer and sets `external_executor_id` based on that. On the 
webserver side this loop isn't present, so `external_executor_id` stays 
unset, and when another scheduler picks the task up, the task dies.
   
   We should **not** see this behaviour for tasks that are first started by the 
Scheduler and then (re-)triggered manually.
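
   The suspected mechanism can be sketched as a toy model. This is not 
Airflow's code: `TaskInstance` here is a hypothetical stand-in, and 
`process_event_buffer` and `can_adopt` are made-up helpers. The adoption rule 
(a task needs a known executor id to be taken over) is an assumption based on 
the behaviour described above.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class TaskInstance:
    # Hypothetical stand-in for Airflow's TaskInstance; only the fields needed here.
    key: str
    state: str = "queued"
    external_executor_id: Optional[str] = None


def process_event_buffer(
    event_buffer: Dict[str, Tuple[str, str]],
    task_instances: Dict[str, TaskInstance],
) -> None:
    """Scheduler-side step: copy executor ids from buffered events onto tasks."""
    for key, (state, executor_id) in event_buffer.items():
        ti = task_instances.get(key)
        if ti is not None and state == "queued":
            ti.external_executor_id = executor_id
    event_buffer.clear()


def can_adopt(ti: TaskInstance) -> bool:
    """Assumed rule: another scheduler can only take over a task with a known id."""
    return ti.external_executor_id is not None


scheduler_ti = TaskInstance(key="dag.task_a")  # queued by the scheduler
webserver_ti = TaskInstance(key="dag.task_b")  # queued through the webserver

# Only the scheduler path drains the event buffer, so only task_a gets an id.
process_event_buffer(
    {"dag.task_a": ("queued", "celery-id-123")},
    {"dag.task_a": scheduler_ti},
)

print(can_adopt(scheduler_ti))
print(can_adopt(webserver_ti))
```

   In this model the webserver-started task never passes through 
`process_event_buffer`, which is the gap I suspect in the real code path.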
   
   Code fragment showing where we set it when a task is started by the Scheduler:
   
   
https://github.com/apache/airflow/blob/db6acd9e8a91e0eca9e12cace72edc57b2667d25/airflow/jobs/scheduler_job.py#L596-L599
  
   
   
   To verify, I'm interested in these questions as well:
   1. What do you have for 
[worker_refresh_interval](https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html#worker-refresh-interval)
 in your airflow.cfg?
   2. Is it possible that you are restarting your webserver frequently, or 
after you started the task?
   3. To get to the root cause of the issue (and fix it), it would help if you 
could provide DEBUG logs of the scheduler while reproducing the issue. Could 
you please provide these when you have the time?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

