ferruzzi commented on code in PR #53215:
URL: https://github.com/apache/airflow/pull/53215#discussion_r2202217260
##########
airflow-core/src/airflow/models/deadline.py:
##########
@@ -98,6 +98,61 @@ def _determine_resource() -> tuple[str, str]:
f"{self.deadline_time} or run: {self.callback}({callback_kwargs})"
)
+ @classmethod
+ @provide_session
+ def remove_deadlines(cls, *, session: Session, conditions: dict[Column,
Any]) -> int:
+ """
+ Remove deadlines from the table which match the provided conditions
and return the number removed.
+
+ NOTE: This should only be used to remove deadlines which are
associated with
+ successful dagruns. If the deadline was missed, move it to the
`missed_deadlines`
+ table after executing the callback.
+ TODO: Create the missed_deadlines table (Ramit)
+
+ :param conditions: Dictionary of conditions to evaluate against.
+ :param session: Session to use.
+ """
+ from airflow.models import DagRun # Avoids circular import
+
+ # Assemble the filter conditions.
+ filter_conditions = [column == value for column, value in
conditions.items()]
+ if not filter_conditions:
+ return 0
+
+ try:
+ # Get deadlines which match the provided conditions and their
associated DagRuns.
+ deadline_dagrun_pairs = (
+ session.query(Deadline,
DagRun).join(DagRun).filter(and_(*filter_conditions)).all()
+ )
+ except SQLAlchemyError as e:
+ invalid_column = next(iter(conditions.keys())) # Get the first
key that caused the error.
+ message = f"Invalid column '{invalid_column}' specified in
conditions while resolving deadlines. Rolling back changes."
+ logger.exception(message)
+ session.rollback()
+ raise SQLAlchemyError(message) from e
+
+ if not deadline_dagrun_pairs:
+ return 0
+
+ deleted_count = 0
+ dagruns_to_refresh = set()
+
+ for deadline, dagrun in deadline_dagrun_pairs:
+ if dagrun.end_date <= deadline.deadline_time:
+ # If the DagRun finished before the Deadline:
+ session.delete(deadline)
+ deleted_count += 1
+ dagruns_to_refresh.add(dagrun)
+ session.flush()
+
+ logger.debug("%d deadline records were deleted matching the conditions
%s", deleted_count, conditions)
+
+ # Refresh any affected DAG runs.
+ for dagrun in dagruns_to_refresh:
+ session.refresh(dagrun)
Review Comment:
I debated this part. I could theoretically do this in the `for deadline,
dagrun...` loop above, but a dagrun may have multiple deadlines so this is
theoretically more efficient. I can move it into the loop above if anyone
feels strongly about it.
##########
airflow-core/src/airflow/models/deadline.py:
##########
@@ -98,6 +98,61 @@ def _determine_resource() -> tuple[str, str]:
f"{self.deadline_time} or run: {self.callback}({callback_kwargs})"
)
+ @classmethod
+ @provide_session
+ def remove_deadlines(cls, *, session: Session, conditions: dict[Column,
Any]) -> int:
+ """
+ Remove deadlines from the table which match the provided conditions
and return the number removed.
+
+ NOTE: This should only be used to remove deadlines which are
associated with
+ successful dagruns. If the deadline was missed, move it to the
`missed_deadlines`
+ table after executing the callback.
+ TODO: Create the missed_deadlines table (Ramit)
Review Comment:
@ramitkataria - Please remove this todo when you get this part done.
##########
airflow-core/tests/unit/models/test_dagrun.py:
##########
@@ -1256,13 +1256,14 @@ def test_dagrun_success_deadline(self, dag_maker,
session):
def on_success_callable(context):
assert context["dag_run"].dag_id == "test_dagrun_success_callback"
+ future_date = datetime.datetime.now() + datetime.timedelta(days=365)
+
with dag_maker(
dag_id="test_dagrun_success_callback",
schedule=datetime.timedelta(days=1),
- start_date=datetime.datetime(2017, 1, 1),
on_success_callback=on_success_callable,
deadline=DeadlineAlert(
- reference=DeadlineReference.FIXED_DATETIME(DEFAULT_DATE),
+ reference=DeadlineReference.FIXED_DATETIME(future_date),
Review Comment:
This was working with a date in the past because we weren't yet checking if
the deadline had passed. Now that we are checking it, we need to use a date
int he future here.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]