ashb opened a new pull request, #66878:
URL: https://github.com/apache/airflow/pull/66878

   I was looking at the core scheduler logic, and the 
`_executable_task_instances_to_queued` method was rather large and hard to 
understand, even for me, and past me wrote a good chunk of it!
   
   This splits it into two focused methods (plus a helper) to aid readability:
   
   - `_acquire_pool_capacity`: takes the advisory lock (held for the lifetime 
of the session, so longer than just this function) and reads pool utilisation 
via `SELECT ... FOR UPDATE`. Returns `(pools, max_tis, starved_pools)` so the 
caller can short-circuit when all pools are full, before doing any TI selection 
work.
   
   - `_select_task_instances_to_queue`: given pre-computed pool capacity, 
selects eligible SCHEDULED TIs and moves them to QUEUED. Accepts the `pools` 
dict and `starved_pools` set as parameters, making it directly testable without 
needing a real lock or DB pool read. It uses the new 
`_build_schedulable_tis_query` helper function to build the complex query.
   
   `_critical_section_enqueue_task_instances` now calls these two methods in 
sequence, making the two-phase structure (acquire capacity, then select and 
queue) more explicit.
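
   A rough, self-contained sketch of that two-phase shape, for illustration 
only. Nothing here is the actual Airflow code: `PoolStats`, the function names 
without leading underscores, and the in-memory inputs are all hypothetical 
stand-ins, and the lock and DB read are reduced to a comment. It assumes 
`max_tis` is capped by the total number of open slots and that a pool is 
"starved" when it has no open slots.

```python
from dataclasses import dataclass


@dataclass
class PoolStats:
    """Hypothetical stand-in for per-pool utilisation (not Airflow's type)."""
    total: int
    running: int
    queued: int

    @property
    def open(self) -> int:
        return self.total - self.running - self.queued


def acquire_pool_capacity(pool_rows, max_tis):
    # Phase 1 (toy): the real method would take the lock and read pool rows
    # from the DB here; this sketch just works on an in-memory dict.
    pools = dict(pool_rows)
    starved_pools = {name for name, stats in pools.items() if stats.open <= 0}
    open_total = sum(max(stats.open, 0) for stats in pools.values())
    return pools, min(max_tis, open_total), starved_pools


def select_task_instances_to_queue(scheduled, pools, starved_pools, max_tis):
    # Phase 2 (toy): pick TIs whose pool still has room, up to max_tis.
    remaining = {name: stats.open for name, stats in pools.items()}
    queued = []
    for ti_id, pool_name in scheduled:
        if len(queued) >= max_tis:
            break
        if pool_name in starved_pools or remaining.get(pool_name, 0) <= 0:
            continue
        remaining[pool_name] -= 1
        queued.append(ti_id)
    return queued


def critical_section_enqueue(pool_rows, scheduled, max_tis):
    pools, max_tis, starved_pools = acquire_pool_capacity(pool_rows, max_tis)
    if max_tis == 0:
        return []  # every pool is full: skip TI selection entirely
    return select_task_instances_to_queue(scheduled, pools, starved_pools, max_tis)
```

   Because the second function only sees plain arguments, it can be exercised 
without the first one ever running.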
   
   All test call sites are updated to call `_select_task_instances_to_queue` 
directly with a `make_pool_stats()` helper, which removes the dependency on 
pool row locking in unit tests.
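
   As an illustration of the testing pattern (not the real helper), a test can 
build the pool inputs by hand and pass them straight in. The signature of 
`make_pool_stats` below, the `{"open": n}` dict shape, and the `pick_queueable` 
function are assumptions made up for this sketch; the actual helper in the test 
suite may look quite different.

```python
def make_pool_stats(**open_slots):
    """Hypothetical helper: build (pools, starved_pools) from open-slot counts."""
    pools = {name: {"open": n} for name, n in open_slots.items()}
    starved_pools = {name for name, stats in pools.items() if stats["open"] <= 0}
    return pools, starved_pools


def pick_queueable(scheduled, pools, starved_pools):
    """Toy stand-in for the selection step: a pure function of its inputs."""
    budget = {name: stats["open"] for name, stats in pools.items()}
    picked = []
    for ti_id, pool in scheduled:
        if pool in starved_pools or budget.get(pool, 0) <= 0:
            continue
        budget[pool] -= 1
        picked.append(ti_id)
    return picked


def test_starved_pool_is_skipped():
    # No lock and no DB pool read: capacity is supplied directly.
    pools, starved = make_pool_stats(default_pool=1, etl=0)
    scheduled = [("a", "etl"), ("b", "default_pool"), ("c", "default_pool")]
    assert pick_queueable(scheduled, pools, starved) == ["b"]
```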
   
   No behaviour changes, just refactoring.


