GitHub user ZhaoMJ closed a discussion: [DISCUSS] Allow 
manual/operator-triggered DAG runs with future logical_date

Hi all,
 
I'd like to discuss a change proposed in PR #65856 that relaxes the blanket ban 
on future `logical_date` for manually triggered and operator-triggered DAG 
runs. 
 
### Background
 
PR #46663 removed the `allow_trigger_in_future` config and hardcoded a block on 
any DAG run whose `logical_date` is in the future. This affects both the 
scheduler (`_schedule_dag_run`) and the `RunnableExecDateDep` task dependency. 
The result: when a user triggers a DAG with a future `logical_date`, the DagRun 
appears as "running" but tasks do not execute until the `logical_date` is 
reached. 
 
This is blocking our migration from Airflow 2 to 3 — we use `logical_date` to 
represent business session dates that don't align with calendar dates, and 
these runs need to execute immediately at trigger time.
 
Given that `run_after` is used for scheduling and `logical_date` is meant to 
"not contain any semantics, but is simply a value for logical identification," 
I believe it makes sense to relax the block on future `logical_date`. Please 
see the [PR description](https://github.com/apache/airflow/pull/65856) for the 
rationale, use cases, and more details.
 
### Proposed change

Skip the future `logical_date` block for `MANUAL` and `OPERATOR_TRIGGERED` run 
types only; scheduled runs remain blocked. The existing `run_after <= now()` 
query filter still controls when the scheduler picks up a run, but DAG runs 
with a future `logical_date` will be able to execute immediately if `run_after` 
is set to `None` or `now()`. A "Run immediately" checkbox in the Trigger DAG UI 
gives users explicit control. The PR also adds `run_after` to 
`TriggerDagRunOperator` and `TriggerDAGRunPayload` so that users can control 
this from the operator path as well. 
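
To make the intended behavior concrete, here is a minimal sketch of the guard described above. It is not the code from the PR: `RunType`, `EXEMPT_RUN_TYPES`, `blocked_by_future_logical_date`, and `is_runnable` are hypothetical stand-ins written for illustration, and nothing is imported from Airflow.

```python
from datetime import datetime, timedelta, timezone
from enum import Enum


class RunType(str, Enum):
    # Hypothetical stand-in for the run-type enum; not imported from Airflow.
    SCHEDULED = "scheduled"
    MANUAL = "manual"
    OPERATOR_TRIGGERED = "operator_triggered"


# Run types that would skip the future-logical_date block under the proposal.
EXEMPT_RUN_TYPES = {RunType.MANUAL, RunType.OPERATOR_TRIGGERED}


def blocked_by_future_logical_date(run_type, logical_date, now):
    """Return True if the run must wait because its logical_date is in the future."""
    if logical_date is None or logical_date <= now:
        return False
    return run_type not in EXEMPT_RUN_TYPES


def is_runnable(run_type, logical_date, run_after, now):
    """A run is eligible once run_after has passed and the block does not apply."""
    if run_after > now:
        # Mirrors the `run_after <= now()` filter: the run is not picked up yet.
        return False
    return not blocked_by_future_logical_date(run_type, logical_date, now)


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
session_date = now + timedelta(days=3)  # future logical_date, e.g. a business session

print(is_runnable(RunType.MANUAL, session_date, now, now))
print(is_runnable(RunType.SCHEDULED, session_date, now, now))
```

In this sketch a manual run with a future `logical_date` and `run_after=now` is eligible right away, while a scheduled run with the same dates still waits.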

### Why I believe this is non-breaking

The default behavior is unchanged. The `run_after` field already controls when 
the scheduler picks up a run. Manual triggers via the Core API already allow 
setting `run_after=now()` with a future `logical_date` — this just removes the 
secondary block that prevents it from working.
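
As a rough illustration of that kind of request, the sketch below builds a trigger body with a future `logical_date` and `run_after` set to the current time. The helper `build_trigger_payload` is hypothetical, the two field names are taken from this post, and the endpoint and any other payload fields are deliberately left out.

```python
import json
from datetime import datetime, timedelta, timezone


def build_trigger_payload(logical_date, run_after=None, now=None):
    """Build a JSON-serializable trigger body (hypothetical helper).

    If run_after is omitted, this sketch fills in the current time explicitly.
    """
    now = now or datetime.now(timezone.utc)
    return {
        "logical_date": logical_date.isoformat(),
        "run_after": (run_after or now).isoformat(),
    }


now = datetime.now(timezone.utc)
payload = build_trigger_payload(logical_date=now + timedelta(days=3), now=now)
print(json.dumps(payload, indent=2))
```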
 
### Broader question raised in review

`logical_date` is still used for TI prioritization in 
`_executable_task_instances_to_queued` and as a scheduling guard in 
`_schedule_dag_run`. Given that `logical_date` is defined as a pure 
identifier ("does not contain any semantics") while `run_after` is 
defined as the scheduling control, should we decouple `logical_date` from 
scheduling entirely? I think that's a separate, larger discussion, so this PR 
intentionally avoids changing TI ordering.
 
I'm looking for feedback on whether the approach is reasonable and whether the 
broader decoupling warrants its own discussion thread.

Thanks,
Mingjie Zhao

GitHub link: https://github.com/apache/airflow/discussions/65949
