1fanwang commented on code in PR #67043:
URL: https://github.com/apache/airflow/pull/67043#discussion_r3359099494


##########
airflow-core/tests/unit/jobs/test_scheduler_job.py:
##########
@@ -914,6 +914,69 @@ def 
test_process_executor_events_multiple_try_numbers_warns(
         mock_task_callback.assert_not_called()
         mock_stats.incr.assert_not_called()
 
+    def test_process_executor_events_queued_updates_without_row_lock(self, 
dag_maker, session):
+        dag_id = "test_process_executor_events_queued_updates_without_row_lock"
+        executor_id = "queued_executor_id"
+
+        with dag_maker(dag_id=dag_id, fileloc="/test_path1/"):
+            task = EmptyOperator(task_id="dummy_task")
+        ti = dag_maker.create_dagrun().get_task_instance(task.task_id)
+        ti.state = State.QUEUED
+        session.merge(ti)
+        session.commit()
+
+        executor = MockExecutor(do_update=False)
+        scheduler_job = Job()
+        self.job_runner = SchedulerJobRunner(scheduler_job, 
executors=[executor])
+
+        executor.event_buffer[ti.key] = State.QUEUED, executor_id
+
+        with mock.patch(
+            "airflow.jobs.scheduler_job_runner.with_row_locks", 
wraps=with_row_locks
+        ) as row_locks:
+            self.job_runner._process_executor_events(executor=executor, 
session=session)
+
+        ti.refresh_from_db(session=session)
+        assert ti.external_executor_id == executor_id
+        assert ti.key not in executor.event_buffer
+        row_locks.assert_not_called()

Review Comment:
   Following up on the before/after I asked for — I ran it, and it argues 
*against* attaching one. Two threads on a `task_instance`-shaped table; the 
heartbeat updates the same rows in the opposite order:
   
   ```
   BEFORE (SELECT … FOR UPDATE)
     run 1-5: sched=[1213 Deadlock found]  hb=[ok]      5/5 deadlock
   AFTER (direct UPDATE, this PR)
     run 1:   sched=[1213…]  hb=[ok]
     run 2:   sched=[ok]     hb=[1213…]
     run 3:   sched=[ok]     hb=[1213…]
     run 4:   sched=[1213…]  hb=[ok]
     run 5:   sched=[ok]     hb=[1213…]                 5/5 deadlock
   ```
   
   InnoDB takes the same exclusive row locks for a plain `UPDATE` as for 
`SELECT … FOR UPDATE`, so two transactions crossing on the rows still cycle 
(AFTER just alternates the victim). The safety here isn't the lock type — it's 
that only the dispatching scheduler writes `external_executor_id` for a given 
`(TI, try_number)`, so nothing crosses it in practice. That single-writer 
invariant is the load-bearing piece worth confirming (no heartbeat/adoption 
path writing `external_executor_id` on a queued row), more than a deadlock 
trace.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to