potiuk edited a comment on pull request #20894: URL: https://github.com/apache/airflow/pull/20894#issuecomment-1016322047
I am rather sceptical that it changes anything (unless I missed something). What this change effectively does is attempt to run:

```sql
UPDATE task_instance SET state=%(state)s WHERE task_instance.dag_id = %(dag_id_1)s AND task_instance.run_id = %(run_id_1)s AND task_instance.task_id IN () FOR UPDATE;
```

Which I don't think happens anyway, because it makes no sense: the `FOR UPDATE` clause only makes sense in `SELECT` queries. A row lock is always taken by an `UPDATE` query itself, so I think SQLAlchemy will simply silently ignore the `with_for_update()` call.

What I think we really need to fix is the OTHER query that causes this (but we need the server logs to find out what that other query was). As I understand it, this piece of code should be protected by the earlier locking of the `DAG_RUN` row. I believe the deadlock is caused by another query that locks `TASK_INSTANCE` rows for that `DAG_RUN` - this should not happen, and we should fix that other query. There are two reasons it could happen:

1) The other query does not lock the `DAG_RUN` row "deliberately"
2) We have a commit somewhere that releases the `DAG_RUN` lock

I suspected initially that it's the mini-scheduler, but I think it must be something else (the mini-scheduler correctly locks the `DAG_RUN` and I could not find any place where it would release it). Another possibility is that it is an API call or UI action that updates the task instances.

UPDATE: I thought `synchronize_session=False` might have something to do with it, but I looked up the docs, and it looks like it only affects updating the objects kept in the current session; the UPDATE query itself would run the same anyway, as I understand it.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
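To illustrate the point about `FOR UPDATE` being `SELECT`-only syntax, here is a minimal sketch (assuming SQLAlchemy 1.4+; the `task_instance` table stub is illustrative, not the real Airflow model) showing that `with_for_update()` renders a locking clause on a `SELECT`, while a compiled `UPDATE` statement has no such clause - the row lock is implicit:

```python
from sqlalchemy import column, select, table, update

# Minimal stand-in for the task_instance table (illustrative only).
ti = table("task_instance", column("task_id"), column("state"))

# FOR UPDATE is part of SELECT syntax: with_for_update() renders it here.
locking_select = select(ti.c.task_id).with_for_update()
print(locking_select)  # SELECT ... FROM task_instance FOR UPDATE

# A plain UPDATE takes its row locks implicitly; the compiled statement
# contains no FOR UPDATE clause at all.
plain_update = update(ti).values(state="queued")
print(plain_update)  # UPDATE task_instance SET state=:state
```

This is just statement compilation against the generic dialect, but it shows why tacking `with_for_update()` onto a bulk update cannot add any locking beyond what `UPDATE` already does.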
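The lock-ordering reasoning above can be sketched with plain threading locks standing in for row locks (all names here, e.g. `dag_run_lock`, are hypothetical stand-ins, not Airflow code): if every writer takes the `DAG_RUN` lock before touching any `TASK_INSTANCE` locks, two concurrent writers cannot deadlock even if they enumerate the task instances in different orders:

```python
import threading

# Hypothetical stand-ins for row locks: one DAG_RUN lock guarding the run,
# plus per-task TASK_INSTANCE locks.
dag_run_lock = threading.Lock()
ti_locks = {"t1": threading.Lock(), "t2": threading.Lock()}

done = []

def update_tis(name, task_ids):
    # Consistent ordering: always take the DAG_RUN lock before any
    # TASK_INSTANCE lock, as the scheduler code path is expected to do.
    with dag_run_lock:
        for tid in task_ids:
            with ti_locks[tid]:
                pass  # the UPDATE of this task instance would happen here
    done.append(name)

# The two writers list the task instances in opposite orders - the classic
# deadlock setup - but the DAG_RUN lock serializes them, so both complete.
a = threading.Thread(target=update_tis, args=("scheduler", ["t1", "t2"]))
b = threading.Thread(target=update_tis, args=("mini-scheduler", ["t2", "t1"]))
a.start(); b.start()
a.join(timeout=5); b.join(timeout=5)
print(sorted(done))  # ['mini-scheduler', 'scheduler']
```

A writer that skips the `DAG_RUN` lock and grabs `TASK_INSTANCE` locks directly (reason 1 above), or one that releases the `DAG_RUN` lock mid-flight via a commit (reason 2), reintroduces exactly the opposite-order acquisition that deadlocks.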
