hkc-8010 commented on code in PR #66696:
URL: https://github.com/apache/airflow/pull/66696#discussion_r3218186752
##########
airflow-core/src/airflow/api_fastapi/common/parameters.py:
##########
@@ -898,6 +898,35 @@ def is_active(self) -> bool:
)
+class NullableDatetimeRangeFilter(RangeFilter):
+ """
+ RangeFilter for nullable datetime columns that preserves NULL-as-now
semantics.
+
+ Uses explicit OR conditions instead of COALESCE(column, now()) so
PostgreSQL
+ can use btree indexes on each branch via BitmapOr plans rather than forcing
+ a full table scan.
+ """
+
+ def to_orm(self, select: Select) -> Select:
+ if self.value is None:
+ return select
+
+ if self.value.lower_bound_gte:
+ x = self.value.lower_bound_gte
+ select = select.where(or_(self.attribute >= x,
and_(self.attribute.is_(None), func.now() >= x)))
Review Comment:
Good point, accepted for lower bounds. For a lower bound on a nullable
column, a NULL value (task still running / not yet started) satisfies the
filter unconditionally — its eventual value will be >= any past bound — so
`(col >= x) OR (col IS NULL)` is sufficient and cleaner.
Upper bounds are a different story. For `end_date <= X`, we only want to
include running tasks (NULL end_date) if the window extends to the present or
future, so the `NOW() <= x` guard is meaningful there: filtering
`end_date_lte=2020-01-01` should not return tasks still running in 2026.
Updated the lower bounds to use your suggestion, kept the upper bounds
as-is, and added tests to pin both behaviors.
##########
airflow-core/src/airflow/api_fastapi/common/parameters.py:
##########
@@ -898,6 +898,35 @@ def is_active(self) -> bool:
)
+class NullableDatetimeRangeFilter(RangeFilter):
+ """
+ RangeFilter for nullable datetime columns that preserves NULL-as-now
semantics.
+
+ Uses explicit OR conditions instead of COALESCE(column, now()) so
PostgreSQL
+ can use btree indexes on each branch via BitmapOr plans rather than forcing
+ a full table scan.
+ """
+
+ def to_orm(self, select: Select) -> Select:
+ if self.value is None:
+ return select
+
+ if self.value.lower_bound_gte:
+ x = self.value.lower_bound_gte
+ select = select.where(or_(self.attribute >= x,
and_(self.attribute.is_(None), func.now() >= x)))
+ if self.value.lower_bound_gt:
+ x = self.value.lower_bound_gt
+ select = select.where(or_(self.attribute > x,
and_(self.attribute.is_(None), func.now() > x)))
+ if self.value.upper_bound_lte:
+ x = self.value.upper_bound_lte
+ select = select.where(or_(self.attribute <= x,
and_(self.attribute.is_(None), func.now() <= x)))
+ if self.value.upper_bound_lt:
+ x = self.value.upper_bound_lt
+ select = select.where(or_(self.attribute < x,
and_(self.attribute.is_(None), func.now() < x)))
Review Comment:
BETWEEN would still need the same OR IS NULL branches on each side, so the
SQL ends up equivalent to what we have now. It also only covers the inclusive
`>= ... <=` case; the exclusive `>= ... <` variant (used in some callers) would
need separate handling. The two separate WHERE clauses already produce the
optimal BitmapOr plan — BETWEEN wouldn't simplify the generated SQL or help the
planner further.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]