george-zubrienko commented on issue #17507: URL: https://github.com/apache/airflow/issues/17507#issuecomment-917364708
> cf [#18041 (comment)](https://github.com/apache/airflow/issues/18041#issuecomment-915383493)
>
> For me, for the moment, the problem is in backfill mode. Tonight my processes will run on this new Airflow session, and I will see if I get the same errors.
>
> UPDATE: I have changed the `scheduler_heartbeat_sec` param to 60 sec instead of 5 sec and it is better, so maybe a performance problem in the backend (PostgreSQL) causes this issue.

@kiwy42 what backend are you using to store the task instances? In our case, with the Kubernetes Executor, it definitely seems scheduler-related. In a DAG with 55 tasks, around a third receive SIGTERM shortly after starting and then go into a retry loop with "Pid X does not match Pid Y". It was fixed after I reduced the pool size from 128 (all tasks queued at the same time) to 32, so 23 tasks were left in the scheduled state. After I reverted the pool change, the issue came back.
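
To make the pool numbers concrete, here is a minimal sketch of the slot arithmetic only. It is not Airflow code: `split_by_pool` is a hypothetical helper, and it assumes every task occupies exactly one pool slot.

```python
from typing import Tuple


def split_by_pool(num_tasks: int, pool_slots: int) -> Tuple[int, int]:
    """Return (admitted, waiting) for a pool with a fixed number of slots.

    Assumption: each task takes exactly one slot, so the pool admits at
    most `pool_slots` tasks and the rest wait in the scheduled state.
    """
    admitted = min(num_tasks, pool_slots)
    return admitted, num_tasks - admitted


# Pool of 128: all 55 tasks of the DAG are admitted at the same time.
print(split_by_pool(55, 128))

# Pool of 32: 32 tasks are admitted and 23 stay in the scheduled state.
print(split_by_pool(55, 32))
```

With the smaller pool, fewer tasks start at once, which is the only thing that changed between the failing and the working runs.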
