hkc-8010 opened a new pull request, #66696:
URL: https://github.com/apache/airflow/pull/66696

   ## Summary
   
   - `datetime_range_filter_factory` wrapped `start_date` and `end_date` 
columns in
     `COALESCE(column, now())` to treat currently-running tasks (NULL date) as 
"now".
     This form prevents PostgreSQL from using btree indexes, causing parallel 
sequential
     scans across the full `task_instance` table even for narrow date windows.
   - Replace with explicit OR conditions that are semantically equivalent:
     `(col >= X) OR (col IS NULL AND now() >= X)`
     PostgreSQL can use a btree index on each OR branch via BitmapOr. Running 
tasks
     (NULL `end_date` / `start_date`) continue to match date-range queries 
correctly.
   - Adds `NullableDatetimeRangeFilter` subclass of `RangeFilter` for the new 
logic.
     `datetime_range_filter_factory` returns this subclass for `start_date` / 
`end_date`
     and a plain `RangeFilter` for all other filter names.
   
   ## Performance
   
   Measured against a 4.6M-row `task_instance` table with an `end_date` btree 
index:
   
   | | Plan | Cost (first 1000 rows) |
   |---|---|---|
   | Before | Parallel Index Scan + Gather Merge (COALESCE wraps column) | 
19,523 |
   | After | Single Index Scan, stops early via LIMIT (OR branches) | 2,658 |
   
   **7.4x reduction in query cost.**
   
   Note: vanilla Airflow does not ship an `end_date` index by default. The OR 
form is
   the correct architectural fix regardless. Any deployment that adds an 
`end_date` index
   (via a migration) will immediately benefit. A follow-up migration to add the 
index is
   recommended but is outside the scope of this PR.
   
   ## Changes
   
   - `airflow-core/src/airflow/api_fastapi/common/parameters.py`: add
     `NullableDatetimeRangeFilter`; update `datetime_range_filter_factory`
   - `airflow-core/tests/unit/api_fastapi/common/test_parameters.py`: add
     `TestDatetimeRangeFilterFactory` (7 tests covering type dispatch, SQL 
shape,
     no-COALESCE assertion, and NULL-branch presence)
   
   ## PR Checklist
   
   - [x] My PR is targeted at the `main` branch
   - [x] Tests added (31 pass locally and in Breeze core-tests, Python 3.10 / 
SQLite)
   - [x] `prek` hooks all pass (ruff, ruff-format, mypy)
   - [ ] Newsfragment -- will add after PR number is assigned
   
   closes: #66335


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to