george-zubrienko commented on issue #17507:
URL: https://github.com/apache/airflow/issues/17507#issuecomment-917364708


   > cf [#18041 (comment)](https://github.com/apache/airflow/issues/18041#issuecomment-915383493)
   > 
   > For me, for the moment, the problem is in backfill mode. Tonight my processes will run on this new Airflow session, and I will see if I get the same errors.
   > 
   > UPDATE: I have changed the `scheduler_heartbeat_sec` parameter from 5 sec to 60 sec and it is better, so maybe a performance problem in the backend (PostgreSQL) causes this issue.
   > @kiwy42 what backend are you using to store the task instances?
   
   In our case, with the Kubernetes Executor, it definitely seems scheduler-related. In a DAG with 55 tasks, around a third received SIGTERM shortly after starting and then went into a retry loop with "Pid X does not match Pid Y". The problem went away after I reduced the pool size from 128 (all tasks queued at the same time) to 32, so that 23 tasks were left in the scheduled state. After I reverted the pool change, the issue came back.
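
The pool arithmetic above can be checked with a small sketch. This is only an illustration, not Airflow code: `split_by_pool` is a hypothetical helper, and it assumes every task occupies exactly one pool slot and that all 55 tasks become runnable at the same time.

```python
from typing import Tuple


def split_by_pool(total_tasks: int, pool_slots: int) -> Tuple[int, int]:
    """Return (tasks that fit into the pool at once, tasks left waiting).

    Hypothetical helper for illustration; assumes one slot per task.
    """
    if total_tasks < 0 or pool_slots < 0:
        raise ValueError("total_tasks and pool_slots must be non-negative")
    admitted = min(total_tasks, pool_slots)
    return admitted, total_tasks - admitted


if __name__ == "__main__":
    # Pool sizes from the comment: 128 (original) and 32 (reduced).
    for slots in (128, 32):
        admitted, waiting = split_by_pool(55, slots)
        print(f"pool={slots}: {admitted} queued at once, {waiting} left in scheduled state")
```

With 128 slots all 55 tasks are queued together, while 32 slots leave 55 - 32 = 23 tasks in the scheduled state, which matches the numbers reported.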


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


