potiuk edited a comment on pull request #20894:
URL: https://github.com/apache/airflow/pull/20894#issuecomment-1019538985


   Yeah. I can confirm that this change has not changed anything, as I 
suspected @ephraimbuddy . You can test it yourself. This is what I did (latest 
breeze  + postgres)
   
   ```
   export  AIRFLOW__CORE__SQL_ALCHEMY_ENGINE_ARGS='{"echo": true}'
   ```
   ^^ This enables echo of the queries generated by SQLAlchemy.
   
   ```
   pytest 
tests/jobs/test_scheduler_job.py::TestSchedulerJob::test_dagrun_callbacks_are_called
 -s
   ```
   ^^ This is one of the tests that reaches this query.
   
   ```
   UPDATE task_instance SET start_date=%(start_date)s, end_date=%(end_date)s, 
duration=%(duration)s, state=%(state)s
   WHERE task_instance.dag_id = %(dag_id_1)s AND task_instance.run_id = 
%(run_id_1)s AND task_instance.task_id
   IN (%(task_id_1)s)
   
   "{'start_date': datetime.datetime(2022, 1, 23, 18, 1, 14, 880695, 
tzinfo=Timezone('UTC')),
   'end_date': datetime.datetime(2022, 1, 23, 18, 1, 14, 880701, 
tzinfo=Timezone('UTC')),
   'duration': 0, 'state': <TaskInstanceState.SUCCESS: 'success'>,
   'dag_id_1': 'test_task_start_date_scheduling',
   'run_id_1': 'scheduled__2016-01-01T00:00:00+00:00', 'task_id_1': 'dummy2'}"
   ```
   ^^ This was the query generated (as you can see there is no extra lock 
here). To make it easy to find I just added 'raise Exception` right after the 
query - this one was the last query printed.
   
   I reverted the commit "Obtain lock for update when scheduling tis" and 
re-run it:
   
   ```
   UPDATE task_instance SET start_date=%(start_date)s, end_date=%(end_date)s, 
duration=%(duration)s, state=%(state)s
   WHERE task_instance.dag_id = %(dag_id_1)s AND task_instance.run_id = 
%(run_id_1)s AND task_instance.task_id 
   IN (%(task_id_1)s)
   
   {'start_date': datetime.datetime(2022, 1, 23, 18, 11, 22, 12081, 
tzinfo=Timezone('UTC')), 
   'end_date': datetime.datetime(2022, 1, 23, 18, 11, 22, 12087, 
tzinfo=Timezone('UTC')), 
   'duration': 0, 'state': <TaskInstanceState.SUCCESS: 'success'>, 
   'dag_id_1': 'test_task_start_date_scheduling', 
   'run_id_1': 'scheduled__2016-01-01T00:00:00+00:00', 'task_id_1': 'dummy2'}
   ```
   ^^ As you can see, we have the very same query generated by SQLAlchemy, no 
matter if we do the query with/without row lock. Exactly as I suspected - 
sqlalchemy will simply ignore the "with_for_update". I will re-open the 
original error, because IMHO, this change has no chance of fixing it, because 
it does not change anything :(. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to