yuqian90 commented on issue #10790:
URL: https://github.com/apache/airflow/issues/10790#issuecomment-694797430


   > After digging further, I think the slowness that causes the error for our 
case is in this function: `SchedulerJob._process_dags()`. If this function 
takes around 60s, those `reschedule` sensors will hit the `ERROR - Executor 
reports task instance ... killed externally?` error. My previous comment about 
adding the `time.sleep(30)` is just one way to replicate this issue. Anything 
that causes `_process_dags()` to slow down should be able to replicate this 
error.
   
   Further investigation shows that the slowdown behind this issue in our 
case (Airflow 1.10.12) was in `SchedulerJob._process_task_instances`. 
This function is called periodically in the `DagFileProcessor` process spawned 
by the airflow scheduler. Anything that makes it take more than 60s 
seems to trigger these `ERROR - Executor reports task instance ... killed 
externally?` errors for sensors in `reschedule` mode with a `poke_interval` of 
60s. I'm trying to address one of the causes of the 
`SchedulerJob._process_task_instances` slowdown for our own case in #11010, 
but that is not a fix for the other causes of this same error. 
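
   To check whether a given call is getting close to the 60s mark described 
above, one option is to time it and log when it runs over. The following is 
only a minimal sketch and not part of Airflow: `warn_if_slower_than`, 
`POKE_INTERVAL_SECONDS` and the `clock`/`report` parameters are hypothetical 
names chosen for illustration, and the 60s default simply mirrors the 
`poke_interval` mentioned above.

```python
import time
from contextlib import contextmanager

# Assumption: 60 s mirrors the poke_interval of the affected sensors.
POKE_INTERVAL_SECONDS = 60.0


@contextmanager
def warn_if_slower_than(label, threshold_seconds=POKE_INTERVAL_SECONDS,
                        clock=time.monotonic, report=print):
    """Time the wrapped block and report when it runs past the threshold.

    Hypothetical helper for diagnosis only; it does not change scheduling.
    """
    start = clock()
    try:
        yield
    finally:
        elapsed = clock() - start
        if elapsed > threshold_seconds:
            report(
                f"{label} took {elapsed:.1f}s, "
                f"over the {threshold_seconds:.0f}s threshold"
            )
```

   Wrapping the suspected slow call in `with 
warn_if_slower_than("_process_task_instances"):` would then produce a message 
whenever that call exceeds the threshold, which may help confirm whether the 
slowdown lines up with the `killed externally?` errors.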
   



