Nataneljpwd commented on code in PR #60216:
URL: https://github.com/apache/airflow/pull/60216#discussion_r2741536332


##########
airflow-core/src/airflow/api_fastapi/core_api/routes/public/backfills.py:
##########
@@ -246,10 +246,10 @@ def create_backfill(
         )
         return BackfillResponse.model_validate(backfill_obj)
 
-    except AlreadyRunningBackfill:
+    except AlreadyRunningBackfill as e:
         raise HTTPException(
             status_code=status.HTTP_409_CONFLICT,
-            detail=f"There is already a running backfill for dag {backfill_request.dag_id}",
+            detail=str(e),

Review Comment:
   Will the exception propagated here include details about the overlap, i.e. 
which dates overlap?



##########
airflow-core/src/airflow/ui/src/components/Banner/BackfillBanner.tsx:
##########
@@ -64,7 +65,9 @@ const BackfillBanner = ({ dagId }: Props) => {
           : false,
     },
   );
-  const [backfill] = data?.backfills.filter((bf: BackfillResponse) => bf.completed_at === null) ?? [];
+  const activeBackfills = data?.backfills.filter((bf: BackfillResponse) => bf.completed_at === null) ?? [];

Review Comment:
   Isn't this guaranteed to be the case, i.e. that `completed_at` is null?
   I am less familiar with the UI code, so correct me if I am wrong.



##########
airflow-core/tests/unit/models/test_backfill.py:
##########
@@ -363,18 +368,54 @@ def test_active_dag_run(dag_maker, session):
         dag_run_conf={"this": "param"},
     )
     assert b1 is not None
-    with pytest.raises(AlreadyRunningBackfill, match="Another backfill is running for dag"):
+
+    # Try to create overlapping backfill - should fail
+    with pytest.raises(AlreadyRunningBackfill, match="Another backfill is running for Dag"):

Review Comment:
   Maybe the exception message should be changed to tell the user that an 
overlapping backfill is present, not just any backfill.



##########
airflow-core/src/airflow/models/backfill.py:
##########
@@ -503,18 +503,25 @@ def _create_backfill(
         if no_schedule:
             raise DagNoScheduleException(f"{dag_id} has no schedule")
 
-        num_active = session.scalar(
-            select(func.count()).where(
+        # Check for overlapping date ranges with active backfills
+        # Two date ranges overlap if: new_from_date <= existing_to_date
+        # AND new_to_date >= existing_from_date
+        overlapping_backfills = session.scalars(
+            select(Backfill).where(
                 Backfill.dag_id == dag_id,
                 Backfill.completed_at.is_(None),
+                Backfill.from_date <= to_date,
+                Backfill.to_date >= from_date,
             )
-        )
-        if num_active is None:
-            raise UnknownActiveBackfills(dag_id)
-        if num_active > 0:
+        ).all()
+
+        if overlapping_backfills:
+            active_ranges = [
+                f"{bf.from_date.isoformat()} to {bf.to_date.isoformat()}" for bf in overlapping_backfills

Review Comment:
   If there is more than one overlap, the UI message can get quite cluttered. 
Maybe it is worth limiting it to only 2 ranges? I know this loses data and might 
require another attempt, but the message will be clean and easy to read.
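
   A rough sketch of what capping the message could look like, not a proposal 
for the exact wording. `format_overlap_message` and `MAX_RANGES_IN_MESSAGE` are 
hypothetical names, and the ranges are assumed to arrive as `(from_date, 
to_date)` pairs of `datetime` objects:

```python
from datetime import datetime

# Hypothetical cap on how many overlapping ranges the message lists.
MAX_RANGES_IN_MESSAGE = 2


def format_overlap_message(dag_id, ranges, limit=MAX_RANGES_IN_MESSAGE):
    """Build a short message from (from_date, to_date) pairs.

    Only the first `limit` ranges are spelled out; the rest are counted.
    """
    shown = [f"{start.isoformat()} to {end.isoformat()}" for start, end in ranges[:limit]]
    message = f"Overlapping active backfill(s) for Dag {dag_id}: " + "; ".join(shown)
    hidden = len(ranges) - len(shown)
    if hidden > 0:
        # Tell the user that more overlaps exist without listing them all.
        message += f" (and {hidden} more)"
    return message
```

   With three overlapping ranges the message lists the first two and ends 
with "(and 1 more)", so the user still learns that further overlaps exist.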



##########
airflow-core/src/airflow/models/backfill.py:
##########
@@ -503,18 +503,25 @@ def _create_backfill(
         if no_schedule:
             raise DagNoScheduleException(f"{dag_id} has no schedule")
 
-        num_active = session.scalar(
-            select(func.count()).where(
+        # Check for overlapping date ranges with active backfills
+        # Two date ranges overlap if: new_from_date <= existing_to_date
+        # AND new_to_date >= existing_from_date
+        overlapping_backfills = session.scalars(
+            select(Backfill).where(
                 Backfill.dag_id == dag_id,
                 Backfill.completed_at.is_(None),
+                Backfill.from_date <= to_date,
+                Backfill.to_date >= from_date,
             )
-        )

Review Comment:
   I think we can select only 2 rows here, and only the needed columns. That 
should be lighter than the current query, yet still provide all the data 
required for the UX, since at each stage we will still be able to run a backfill 
job with a from and to date.
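
   To illustrate the idea outside of Airflow, here is a minimal sketch using 
the standard-library `sqlite3` module. The table is a simplified, hypothetical 
stand-in for the real model, dates are stored as ISO strings (which compare 
correctly as text), and `find_overlaps` is an illustrative name:

```python
import sqlite3

# Hypothetical, simplified table; this is not Airflow's actual schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE backfill (id INTEGER PRIMARY KEY, dag_id TEXT, "
    "from_date TEXT, to_date TEXT, completed_at TEXT)"
)
conn.executemany(
    "INSERT INTO backfill (dag_id, from_date, to_date, completed_at) VALUES (?, ?, ?, ?)",
    [
        ("my_dag", "2024-01-01", "2024-01-05", None),
        ("my_dag", "2024-01-04", "2024-01-10", None),
        ("my_dag", "2024-01-02", "2024-01-03", None),
        ("my_dag", "2024-01-01", "2024-01-31", "2024-02-01"),  # already completed
        ("other_dag", "2024-01-01", "2024-01-31", None),  # different Dag
    ],
)


def find_overlaps(conn, dag_id, from_date, to_date, limit=2):
    """Return at most `limit` (from_date, to_date) pairs of active overlapping backfills."""
    return conn.execute(
        # Fetch only the two date columns, and stop after `limit` rows.
        "SELECT from_date, to_date FROM backfill "
        "WHERE dag_id = ? AND completed_at IS NULL "
        # Same overlap test as in the diff: existing.from <= new.to AND existing.to >= new.from
        "AND from_date <= ? AND to_date >= ? "
        "ORDER BY from_date LIMIT ?",
        (dag_id, to_date, from_date, limit),
    ).fetchall()
```

   In SQLAlchemy terms this would presumably be something like 
`select(Backfill.from_date, Backfill.to_date)` with the same filters plus 
`.limit(2)`, though I have not tried it against this code.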


