luke-goddard commented on issue #36920:
URL: https://github.com/apache/airflow/issues/36920#issuecomment-2034802986
Upgrading from v2.7.2 -> v2.8.4 did not resolve the issue for me, although
the 'DAG scheduling was skipped, ...' error is gone now.
Strangely the issue is only occurring on our production Azure AKS cluster
(k8, helm, k8-executor). We have a staging/testing airflow instance with a
similar configuration and infrastructure as code deployed on a VM using
microk8s. Neither our staging/testing instances have this issue.
Would love to help out on this issue, but not sure where to start.
When running the following while the tasks are stuck in scheduled gave no
results.
```
select pid, state, usename, query, query_start, client_addr, client_port,
wait_event, wait_event_type
from pg_stat_activity
where pid in (
select pid from pg_locks l
join pg_class t on l.relation = t.oid
and t.relkind = 'r'
where t.relname = 'task_instance'
--and state = 'idle in transaction'
);
```
Restarting the scheduler did work. Guess, a workaround would be to create a
k8 job to periodically restart the scheduler pod.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]