Copilot commented on code in PR #62561:
URL: https://github.com/apache/airflow/pull/62561#discussion_r2926779967


##########
airflow-core/src/airflow/jobs/scheduler_job_runner.py:
##########
@@ -1981,6 +1981,9 @@ def _mark_backfills_complete(self, session: Session = NEW_SESSION) -> None:
         # todo: AIP-78 simplify this function to an update statement
         query = select(Backfill).where(
             Backfill.completed_at.is_(None),
+            # Guard: backfill must have at least one association,
+            # otherwise it is still being set up (see #61375).
+            exists(select(BackfillDagRun.id).where(BackfillDagRun.backfill_id == Backfill.id)),

Review Comment:
   The new `EXISTS(backfill_dag_run)` guard means a Backfill that gets 
committed but never manages to create any `BackfillDagRun` rows (e.g. if 
`_create_backfill()` errors/crashes after the `session.commit()` at 
`airflow/models/backfill.py:605`) will never be auto-completed by the 
scheduler. Since `_create_backfill()` blocks new backfills by counting 
`Backfill.completed_at IS NULL` (`airflow/models/backfill.py:577-590`), this 
can leave a DAG permanently unable to start new backfills without manual DB 
cleanup. Consider adding a bounded “initializing” window (e.g., only require 
the association for very recent backfills) or introducing an explicit backfill 
state/failed marker so initialization failures don’t create stuck active 
backfills.
   ```suggestion
           # Treat very recent backfills with no associations as "initializing",
           # but allow older ones without BackfillDagRun rows to be auto-completed
           # so they don't block new backfills if initialization failed.
           initializing_cutoff = now - timedelta(minutes=5)
           # todo: AIP-78 simplify this function to an update statement
           query = select(Backfill).where(
               Backfill.completed_at.is_(None),
               or_(
                   # Backfill has at least one association and is fully initialized.
                   exists(select(BackfillDagRun.id).where(BackfillDagRun.backfill_id == Backfill.id)),
                   # Or it is older than the initializing window; treat it as no longer initializing
                   # even if it has no BackfillDagRun rows (e.g. initialization crashed).
                   Backfill.created_at < initializing_cutoff,
               ),
   ```
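
To make the intended behavior of the bounded window easier to check, here is a minimal pure-Python sketch of the guard only. It is not Airflow code: `BackfillRow`, `dag_run_count`, and `is_completion_candidate` are hypothetical stand-ins for the `Backfill` row, the `EXISTS` subquery, and the `WHERE` clause, and the 5-minute window is the value from the suggestion above. Any other conditions the scheduler applies are deliberately left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class BackfillRow:
    """Hypothetical stand-in for a Backfill row (only fields the guard needs)."""

    id: int
    created_at: datetime
    completed_at: Optional[datetime] = None
    dag_run_count: int = 0  # models EXISTS(BackfillDagRun) as count > 0


# Assumption: same 5-minute window as in the suggestion.
INITIALIZING_WINDOW = timedelta(minutes=5)


def is_completion_candidate(
    row: BackfillRow,
    now: datetime,
    window: timedelta = INITIALIZING_WINDOW,
) -> bool:
    """Mirror the suggested predicate:
    completed_at IS NULL AND (has association OR created_at < now - window).
    """
    if row.completed_at is not None:
        return False
    has_association = row.dag_run_count > 0
    past_initializing_window = row.created_at < now - window
    return has_association or past_initializing_window


if __name__ == "__main__":
    now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
    rows = [
        # Just created, no runs yet: still initializing, must be skipped.
        BackfillRow(1, created_at=now - timedelta(minutes=1)),
        # Old and never got any runs: initialization likely failed.
        BackfillRow(2, created_at=now - timedelta(minutes=10)),
        # Recent but already has runs: fully initialized.
        BackfillRow(3, created_at=now - timedelta(minutes=1), dag_run_count=2),
    ]
    for row in rows:
        print(row.id, is_completion_candidate(row, now))
```

With the original `EXISTS`-only guard, the second row would never be selected and would keep counting as an active backfill; with the bounded window it becomes eligible once it is older than the cutoff.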



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
