uranusjr opened a new issue #18758: URL: https://github.com/apache/airflow/issues/18758
### Description

Currently (in versions up to 2.1.4), `airflow dags test <dag_id> <execution_date>` creates a *backfill* run at the specified datetime. This, however, applies regardless of whether the DAG can _logically_ have an automated backfill at that specific datetime. One example of this confusing behaviour is shown in #18473: a DAG with `schedule_interval=None` should logically never have backfill runs, but the `test` command would still happily create a backfill run at that datetime.

With the introduction of custom timetables in AIP-39, the DAG scheduling logic went through extensive refactoring to conform more closely to the DAG's schedule/timetable specification. This means that a backfill run can no longer be created at will. The 2.2 release will contain a hack to keep the current behaviour of "free" backfill run creation via `test` (#18742), but I would prefer this to be a temporary measure that is removed once we have a better solution.

The root cause of this issue is, IMO, that `airflow dags test` has very poor semantics as currently designed. It is entirely non-obvious that it creates *backfill* runs (so a subsequent `airflow dags backfill` call would skip the specific datetime if and only if it lies on the logical schedule), or why a backfill can happen without considering the schedule (it is the only way to do that in Airflow AFAIK). And the name `test` itself is somewhat of a misnomer: why is creating a backfill run a test in the first place?

### Use case/motivation

From what I can tell, the current primary use case for `airflow dags test` is to check whether a DAG implements its tasks reasonably before it is activated. For this particular use case, the user does not actually care what kind of run is used, so a manual run would do.

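
As a rough illustration of what "lies on the logical schedule" means in the Description, here is a minimal Python sketch. It is a toy model, not Airflow code: `lies_on_schedule` is a hypothetical helper, and it assumes a schedule that is either absent (`None`, modelling `schedule_interval=None`) or a fixed positive `timedelta` counted from a start datetime.

```python
from datetime import datetime, timedelta
from typing import Optional


def lies_on_schedule(
    when: datetime,
    start: datetime,
    interval: Optional[timedelta],
) -> bool:
    """Return True if the schedule itself would produce a run at `when`.

    Hypothetical helper for illustration only; this is not Airflow's API.
    `interval=None` models a DAG declared with `schedule_interval=None`.
    """
    if interval is None:
        # An unscheduled DAG has no logical run datetimes at all.
        return False
    if when < start:
        # Nothing is scheduled before the start datetime.
        return False
    # On schedule when the offset from start is a whole number of intervals.
    return (when - start) % interval == timedelta(0)


start = datetime(2021, 1, 1)
daily = timedelta(days=1)

on_schedule = lies_on_schedule(datetime(2021, 1, 5), start, daily)
off_schedule = lies_on_schedule(datetime(2021, 1, 5, 12), start, daily)
unscheduled = lies_on_schedule(datetime(2021, 1, 5), start, None)
```

Under this model, a DAG without a schedule never has a datetime at which a backfill makes logical sense, which is the mismatch reported in #18473.
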
But we should also create a migration path for those relying on `airflow dags test` to create a backfill run, since the implied side effect of saving a backfill run later on is also somewhat useful. So the plan I currently have in mind is:

* Add a new flag to `airflow dags trigger` to allow triggering a manual run and executing it directly in the console (instead of sending it to the scheduler). This will need some new mechanism, since `trigger` is currently implemented by `DAG.create_dagrun()`. I think we'll need a new job class, e.g. `ManualRunJob`.
* Add a new flag to `airflow dags backfill` to do the same thing, but with a backfill run. This would cover the exact same use case as `airflow dags test` does right now, but with more obvious semantics. The syntax would, however, be significantly more verbose; we need to work on that as well.
* Deprecate `airflow dags test`, since its usage can be covered by the above two additions.

### Related issues

* Issue raised against 2.2 beta about the changed behaviour: #18473
* PR to "restore" the pre-2.2 behaviour: #18742

### Are you willing to submit a PR?

- [X] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
