1fanwang opened a new issue, #66462:
URL: https://github.com/apache/airflow/issues/66462

   ### Apache Airflow version
   
   3.x (verified against `main` at HEAD on 2026-05-06; same mechanism present 
in 2.9.x)
   
   ### What happened?
   
   When a DAG with `catchup=False` is paused, the **"Next Run"** value 
displayed in the UI (and returned by the FastAPI endpoint 
`/api/v2/dags/{dag_id}/details` in the `next_dagrun_logical_date` and 
`next_dagrun_run_after` fields) is recomputed on each parse cycle against the 
current wall-clock time. As a result, the displayed value **moves forward by 
one cron period each day, but stays in the past relative to "now"**. To a user, 
this looks like the scheduler is broken: the date keeps changing daily but 
never advances to a future time.
   
   Concrete example with cron `0 1 * * *` (one daily run, `catchup=False`):
   
   | When (UTC) | DAG state | UI "Next Run" shows | User's reading |
   |---|---|---|---|
   | Day 0, 23:30 | unpaused, prior run terminal | Day 1, 08:00 (future) | ✓ "tomorrow" |
   | Day 0, 23:39 | **user pauses** | Day 1, 08:00 (still future at this moment) | ✓ ok |
   | Day 1, 22:40 | parse runs while paused | Day 0, 08:00 (now in past) | ⚠ "stuck?" |
   | Day 2, 22:40 | parse runs while paused | Day 1, 08:00 (still in past) | ⚠ "drifting" |
   | Day 3, 22:40 | parse runs while paused | Day 2, 08:00 (still in past) | ❌ "broken" |
   
   The displayed date strictly increases by one cron period per day, but the 
user's *now* increases by the same amount, so the gap between "now" and "Next 
Run" never closes — the value is permanently in the past while the DAG remains 
paused. This often gets escalated as a scheduling regression before the user 
discovers that the DAG is simply paused.
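
As a quick check of the table above, the lag between the parse time and the displayed value can be computed directly. This is only an arithmetic sketch: Day 0 is mapped to an arbitrary placeholder date (2026-01-01, an assumption made purely for illustration), and the helper `at` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Day 0 is an arbitrary placeholder date chosen for illustration only.
DAY0 = datetime(2026, 1, 1, tzinfo=timezone.utc)


def at(day: int, hour: int, minute: int = 0) -> datetime:
    """Return the UTC timestamp for 'Day <day>, HH:MM'."""
    return DAY0 + timedelta(days=day, hours=hour, minutes=minute)


# (parse time, displayed "Next Run") for the three paused rows of the table.
paused_rows = [
    (at(1, 22, 40), at(0, 8)),
    (at(2, 22, 40), at(1, 8)),
    (at(3, 22, 40), at(2, 8)),
]

# Lag between "now" and the displayed value for each row.
gaps = [now - shown for now, shown in paused_rows]

for (now, shown), gap in zip(paused_rows, gaps):
    print(f"now={now:%m-%d %H:%M}  shown={shown:%m-%d %H:%M}  lag={gap}")
```

Every row lags by the same 1 day, 14 hours and 40 minutes: the displayed value advances, but it never catches up with the clock.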
   
   ### What you think should happen instead?
   
   For paused DAGs, "Next Run" is not predictive of when the next run will fire 
— the scheduler will not materialize any run while `is_paused=True`. The 
current display is therefore misleading.
   
   Three fix options, increasing in invasiveness:
   
   **(A) UI-only.** When `is_paused=True`, render "Next Run" as `—` or `Will 
run on unpause: <logical_date>` (with a tooltip clarifying that the value is 
the *interval that will fire on unpause*, not a scheduled future time). 
Lowest-risk option, addresses the user-facing confusion entirely. Touches 
`airflow-core/src/airflow/ui/src/pages/DagsList/DagCard.tsx:102-108` and the 
equivalent DAG-details view.
   
   **(B) FastAPI shape change.** Set `next_dagrun_logical_date` / 
`next_dagrun_run_after` to `null` (or to a separate `next_dagrun_on_unpause` 
field) when `is_paused=True`. Cleaner contract but breaks any external API 
consumer that reads those fields on paused DAGs. Touches 
`airflow-core/src/airflow/api_fastapi/core_api/datamodels/dags.py`.
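
A rough sketch of the shape option (B) could take, written as a plain function over a response dictionary rather than against the real Pydantic models. The function name `mask_next_dagrun_when_paused` and the sample payload are hypothetical. Only the field names come from this issue, and including the two data-interval fields in the masking is an assumption.

```python
from typing import Any

# Field names as listed in this issue; everything else here is illustrative.
NEXT_DAGRUN_FIELDS = (
    "next_dagrun_logical_date",
    "next_dagrun_data_interval_start",
    "next_dagrun_data_interval_end",
    "next_dagrun_run_after",
)


def mask_next_dagrun_when_paused(payload: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the payload with next_dagrun_* set to None if paused."""
    masked = dict(payload)
    if not masked.get("is_paused"):
        return masked  # unpaused DAGs are returned unchanged
    for field in NEXT_DAGRUN_FIELDS:
        if field in masked:
            masked[field] = None
    return masked


# Hypothetical sample payload for a paused DAG.
sample = {
    "dag_id": "example_daily",
    "is_paused": True,
    "next_dagrun_logical_date": "2026-05-03T08:00:00+00:00",
    "next_dagrun_run_after": "2026-05-04T08:00:00+00:00",
}
print(mask_next_dagrun_when_paused(sample))
```

The function copies the payload instead of mutating it, so the stored values stay available if a separate `next_dagrun_on_unpause` field were added later.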
   
   **(C) Scheduler short-circuit.** Skip `calculate_dagrun_date_fields` in 
`airflow-core/src/airflow/dag_processing/collection.py:638` when 
`is_paused=True`. Stops the rolling-forward computation at the source. Most 
invasive — would freeze `next_dagrun_*` at the value they had when the user 
paused; the existing "fires the missed interval immediately on unpause" 
behavior would need a recompute path on unpause to be preserved.
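
To illustrate the trade-off in option (C), here is a minimal sketch of "freeze while paused, recompute on unpause". `DagRow`, `refresh_next_dagrun` and `unpause` are hypothetical stand-ins and do not correspond to Airflow's `DagModel` or to any existing hook. The `compute` callable stands in for whatever `calculate_dagrun_date_fields` would produce.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Optional


@dataclass
class DagRow:
    """Hypothetical stand-in for the stored DAG row; not Airflow's DagModel."""

    dag_id: str
    is_paused: bool
    next_dagrun: Optional[datetime] = None


def refresh_next_dagrun(row: DagRow, compute: Callable[[], datetime]) -> None:
    """Parse-time step: leave the stored value untouched while paused."""
    if row.is_paused:
        return
    row.next_dagrun = compute()


def unpause(row: DagRow, compute: Callable[[], datetime]) -> None:
    """Unpause step: recompute at once so the latest missed interval can fire."""
    row.is_paused = False
    row.next_dagrun = compute()


frozen = datetime(2026, 5, 4, 8, 0, tzinfo=timezone.utc)
fresh = datetime(2026, 5, 6, 8, 0, tzinfo=timezone.utc)

row = DagRow(dag_id="example_daily", is_paused=True, next_dagrun=frozen)
refresh_next_dagrun(row, lambda: fresh)  # paused: value stays frozen
value_while_paused = row.next_dagrun
unpause(row, lambda: fresh)  # unpaused: value is recomputed
value_after_unpause = row.next_dagrun
```

Without the recompute in `unpause`, the stale frozen value would be what the scheduler sees first, which is the regression risk this option carries.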
   
   (A) is the recommended fix — minimum surface, no behavior change, just a UI 
guard with a clearer label.
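
The actual change for (A) would live in the React component. The following is only a language-neutral sketch of the display rule, written in Python for brevity. `next_run_label` is a hypothetical helper, the label text is the one suggested in option (A), and returning an empty string when there is no value is an assumption.

```python
from typing import Optional


def next_run_label(is_paused: bool, logical_date: Optional[str]) -> str:
    """Decide what the "Next Run" cell should show for a DAG."""
    if logical_date is None:
        return ""  # nothing scheduled: render an empty cell
    if is_paused:
        # Paused: the value is the interval that fires on unpause,
        # not a scheduled future time, so say so explicitly.
        return f"Will run on unpause: {logical_date}"
    return logical_date


print(next_run_label(True, "2026-05-03T08:00:00+00:00"))
print(next_run_label(False, "2026-05-05T08:00:00+00:00"))
```

The underlying `next_dagrun_*` values are untouched, so API consumers and the unpause behavior are unaffected.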
   
   ### How to reproduce
   
   1. Define a DAG with `catchup=False` and a daily cron, e.g. `0 1 * * *`. 
Start the scheduler.
   2. Wait for one scheduled run to complete. Confirm the UI shows "Next Run" 
at the next scheduled time (e.g., tomorrow 1 AM).
   3. Pause the DAG.
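   4. Wait at least one full DAG-parse cycle past the next scheduled time 
(in 2.x this is governed by `[scheduler] dag_dir_list_interval`, default 300 
seconds; Airflow 3 renamed the setting to `[dag_processor] refresh_interval`). 
For a clearer effect, wait until the next day.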
   5. Refresh the DAGs list page.
   
   Observed: "Next Run" displays a date in the past. Each subsequent day the 
date moves forward by one cron period but remains in the past.
   
   Expected: the UI does not display a misleading past-date "Next Run" while 
the DAG is paused.
   
   ### Operating System
   
   N/A — logic bug, not OS-specific
   
   ### Versions of Apache Airflow Providers
   
   N/A
   
   ### Deployment
   
   Other
   
   ### Deployment details
   
   Reproduces with `airflow standalone` on `main` HEAD as of 2026-05-06.
   
   ### Anything else?
   
   #### Source-code trace (verified against `apache/airflow` `main` at 
2026-05-06)
   
   The mechanism is unchanged from 2.9.x → 3.x:
   
   1. **Parse path always runs `calculate_dagrun_date_fields`, regardless of 
pause state.** `airflow-core/src/airflow/dag_processing/collection.py:563-638` 
— `update_dags()` iterates all DAGs and calls 
`dm.calculate_dagrun_date_fields(dag, last_automated_run=last_automated_run)` 
at line 638. There is no `is_paused` short-circuit anywhere in this loop.
   
   2. **`calculate_dagrun_date_fields` calls `next_dagrun_info` which routes to 
the timetable's `_skip_to_latest` for `catchup=False`.** 
`airflow-core/src/airflow/models/dag.py:757-800` calls 
`dag.next_dagrun_info(...)`. The TODO at `dag.py:769` already references the 
existing meta-bug #59618 about the helper's contract.
   
   3. **`_skip_to_latest` recomputes against `utcnow()` on every call.** 
`airflow-core/src/airflow/timetables/interval.py:156-178` 
(CronDataIntervalTimetable). Concrete trace at `current_time = 
2026-05-04T22:40:22+00:00` for cron `0 1 * * *`:
   
      ```
      current_time   = 5-4 22:40 UTC
      last_start     = _get_prev(5-4 22:40)             = 5-4 08:00 UTC
      next_start     = _get_next(5-4 08:00)             = 5-5 08:00 UTC
      # next_start > current_time, so:
      new_start      = _get_prev(last_start)            = 5-3 08:00 UTC
      ```
   
      Run the same code one day later at `5-5 22:40 UTC` and `new_start` 
becomes `5-4 08:00 UTC` — i.e., the value rolls forward by one cron period each 
day. The `RoundedDataIntervalTimetable._skip_to_latest` (`interval.py:237-250`) 
has the same `utcnow()`-based recomputation.
   
   4. **FastAPI returns these fields unchanged for paused DAGs.** 
`airflow-core/src/airflow/api_fastapi/core_api/datamodels/dags.py:84-108` — 
`is_paused: bool` and `next_dagrun_logical_date / 
next_dagrun_data_interval_start / next_dagrun_data_interval_end / 
next_dagrun_run_after` are all serialized as independent fields with no 
conditional logic between them.
   
   5. **React UI renders "Next Run" with no `is_paused` check.** 
`airflow-core/src/airflow/ui/src/pages/DagsList/DagCard.tsx:102-108`:
   
      ```tsx
      <Stat data-testid="next-run" label={translate("dagDetails.nextRun")}>
        {Boolean(dag.next_dagrun_run_after) ? (
          <DagRunInfo
            logicalDate={dag.next_dagrun_logical_date}
            runAfter={dag.next_dagrun_run_after as string}
          />
        ) : undefined}
      </Stat>
      ```
   
      The only `is_paused` check in this file (line 95) is on the spinner for 
the *Latest Run* — Next Run has no equivalent guard.
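
The rolling-forward behavior traced in step 3 can be reproduced outside Airflow with a small simulation. This is a simplified sketch, not the timetable's actual code: the schedule is hard-coded as a daily 08:00 UTC boundary to match the trace, the helpers `get_prev`, `get_next` and `skip_to_latest` are hypothetical stand-ins for the cron-based methods, the handling of the exact-boundary case is an assumption, and the clamp against the DAG's earliest allowed date is omitted.

```python
from datetime import datetime, timedelta, timezone

BOUNDARY_HOUR = 8  # daily boundary at 08:00 UTC, as in the trace above


def _boundary_on(day: datetime) -> datetime:
    """Return the 08:00 UTC boundary that falls on the same calendar day."""
    return day.replace(hour=BOUNDARY_HOUR, minute=0, second=0, microsecond=0)


def get_prev(t: datetime) -> datetime:
    """Latest boundary strictly before t."""
    candidate = _boundary_on(t)
    return candidate if candidate < t else candidate - timedelta(days=1)


def get_next(t: datetime) -> datetime:
    """Earliest boundary strictly after t."""
    candidate = _boundary_on(t)
    return candidate if candidate > t else candidate + timedelta(days=1)


def skip_to_latest(current_time: datetime) -> datetime:
    """Simplified model of the catchup=False skip described in step 3."""
    last_start = get_prev(current_time)
    next_start = get_next(last_start)
    if next_start == current_time:
        # Assumed handling when "now" sits exactly on a boundary.
        return last_start
    # next_start > current_time: step back one more period, as in the trace.
    return get_prev(last_start)


first_parse = datetime(2026, 5, 4, 22, 40, 22, tzinfo=timezone.utc)
for offset in range(3):
    now = first_parse + timedelta(days=offset)
    shown = skip_to_latest(now)
    print(f"now={now:%m-%d %H:%M}  shown={shown:%m-%d %H:%M}  in_past={shown < now}")
```

Each iteration yields a value one day later than the previous one, and every value is earlier than the corresponding `now`, which is the drift shown in the table at the top of this issue.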
   
   #### Relationship to existing issues
   
   - **#59618** (`calculate_dagrun_date_fields doesn't understand how it is 
called`, OPEN, kind:bug area:Scheduler) is the issue referenced by the in-tree 
TODO at `dag.py:769`, but it concerns the helper's parameter contract 
(`last_automated_run` not always being the latest run, leading to backwards 
moves under concurrency). That is a different mechanism with a different 
symptom. Multiple contributors are working on a rename and audit there.
   - **#54927**, **#55675**, **#50890** discuss adjacent paused/deactivated DAG 
behaviors but do not cover the "Next Run" display drift.
   - This issue is not a duplicate of any of those — it's specifically about 
the UX of surfacing the rolling-forward-past `next_dagrun_*` value to a user 
who has paused the DAG.
   
   #### Scope notes
   
   - Specific to `catchup=False`. With `catchup=True`, the timetable does not 
call `_skip_to_latest`, so the rolling-forward effect doesn't manifest the same 
way (paused-then-unpaused with `catchup=True` would attempt backfill of all 
missed intervals — different user-facing behavior).
   - The same display issue affects `next_dagrun_data_interval_start` and 
`next_dagrun_data_interval_end`.
   - The mechanism is correct for *unpaused* DAGs: it's how `catchup=False` 
knows to fire the most recent missed interval on schedule resumption. The 
problem is purely the UX of surfacing the rolling-forward value in a state 
where it's not predictive.
   - Adjacent: any DAG where run materialization is blocked while parsing 
continues (paused, `max_active_runs` exhausted, deactivated) will have the same 
display drift. Paused is the cleanest case to fix first; the others can follow.
   
   ### Are you willing to submit PR?
   
   - [x] Yes I am willing to submit a PR (Option A, UI-only)
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

