vchiapaikeo commented on issue #27296: URL: https://github.com/apache/airflow/issues/27296#issuecomment-1292782722
Adding a bit more analysis here. As far as I can tell, the only place where a query like this (SQL: `UPDATE dag_run SET last_scheduling_decision=%s WHERE dag_run.id = %s`) would be run is in `DagRun.update_state`:
https://github.com/apache/airflow/blob/2.4.2/airflow/models/dagrun.py#L516-L518

Specifically, `last_scheduling_decision` gets set here:
https://github.com/apache/airflow/blob/2.4.2/airflow/models/dagrun.py#L552

I think the most likely place that `DagRun.update_state` is being called from is `SchedulerJob._schedule_dag_run`:
https://github.com/apache/airflow/blob/2.4.2/airflow/jobs/scheduler_job.py#L1242-L1246
https://github.com/apache/airflow/blob/2.4.2/airflow/jobs/scheduler_job.py#L1301

What I don't quite understand is this: if the call comes from SchedulerJob to update the dagrun, why are we seeing these logs on the worker pod? Is that because we're using KubernetesExecutor and the airflow worker pod itself is actually run as LocalExecutor?

I also wonder what could be holding a lock on this same record for >50s...
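To make the lock question concrete, here is a minimal sketch of one way an `UPDATE` like the one above can end up waiting: another session has written to the table and has not committed yet. This is only an analogy, not Airflow code. It uses Python's built-in `sqlite3` with a stand-in `dag_run` table, the function name `demo_lock_timeout` and the short timeout are hypothetical, and SQLite locks the whole database file rather than a single row, so the real metadata database will behave differently in detail.

```python
import os
import sqlite3
import tempfile


def demo_lock_timeout(timeout_s=0.2):
    """Show a second session timing out while a first session holds the write lock.

    Returns (error_message, final_value). Table and column names only mirror
    the query in the log; this is not Airflow's schema or code.
    """
    path = os.path.join(tempfile.mkdtemp(), "demo.db")

    # isolation_level=None puts the connection in autocommit mode, so the
    # BEGIN/COMMIT statements below are issued explicitly.
    holder = sqlite3.connect(path, isolation_level=None)
    holder.execute(
        "CREATE TABLE dag_run (id INTEGER PRIMARY KEY, last_scheduling_decision TEXT)"
    )
    holder.execute("INSERT INTO dag_run (id, last_scheduling_decision) VALUES (1, NULL)")

    # Session 1: open a write transaction, update the row, and do not commit.
    holder.execute("BEGIN IMMEDIATE")
    holder.execute(
        "UPDATE dag_run SET last_scheduling_decision = ? WHERE id = ?", ("t0", 1)
    )

    # Session 2: the same UPDATE waits up to timeout_s for the lock, then fails.
    waiter = sqlite3.connect(path, timeout=timeout_s, isolation_level=None)
    try:
        waiter.execute(
            "UPDATE dag_run SET last_scheduling_decision = ? WHERE id = ?", ("t1", 1)
        )
        message = None
    except sqlite3.OperationalError as exc:
        message = str(exc)

    # Once session 1 commits, session 2 can run the same UPDATE successfully.
    holder.execute("COMMIT")
    waiter.execute(
        "UPDATE dag_run SET last_scheduling_decision = ? WHERE id = ?", ("t1", 1)
    )
    value = waiter.execute(
        "SELECT last_scheduling_decision FROM dag_run WHERE id = 1"
    ).fetchone()[0]

    holder.close()
    waiter.close()
    return message, value


if __name__ == "__main__":
    print(demo_lock_timeout())
```

If the same pattern applies here, the thing to look for would be a long-running transaction that touched the same `dag_run` row and stayed open for longer than the lock wait timeout.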
