howardyoo opened a new issue #21023:
URL: https://github.com/apache/airflow/issues/21023
### Apache Airflow version
main (development)
### What happened
# Product and Version
Airflow Version: v2.3.0.dev0 (Git Version:
.release:2.3.0.dev0+7a9ab1d7170567b1d53938b2f7345dae2026c6ea), which I am using
to test and learn its functionality. I installed it by cloning the git
repository and building Airflow in my macOS environment, using Python 3.9.
# Problem Statement
While testing my DAG, I wanted to run
`airflow dags test <dag_id> <execution_dt>` so that I would not have to trigger
DAG runs through the UI each time. Running the test and inspecting its result
proved to be more productive when iterating rapidly on a DAG.
The test runs perfectly the first time, but when I try to re-run it, the
following error message is observed:
```
[2022-01-21 10:30:33,530] {migration.py:201} INFO - Context impl SQLiteImpl.
[2022-01-21 10:30:33,530] {migration.py:204} INFO - Will assume non-transactional DDL.
[2022-01-21 10:30:33,568] {dagbag.py:498} INFO - Filling up the DagBag from /Users/howardyoo/airflow/dags
[2022-01-21 10:30:33,588] {example_python_operator.py:67} WARNING - The virtalenv_python example task requires virtualenv, please install it.
[2022-01-21 10:30:33,594] {tutorial_taskflow_api_etl_virtualenv.py:29} WARNING - The tutorial_taskflow_api_etl_virtualenv example DAG requires virtualenv, please install it.
Traceback (most recent call last):
  File "/Users/howardyoo/python3/bin/airflow", line 33, in <module>
    sys.exit(load_entry_point('apache-airflow==2.3.0.dev0', 'console_scripts', 'airflow')())
  File "/Users/howardyoo/python3/lib/python3.9/site-packages/airflow/__main__.py", line 48, in main
    args.func(args)
  File "/Users/howardyoo/python3/lib/python3.9/site-packages/airflow/cli/cli_parser.py", line 50, in command
    return func(*args, **kwargs)
  File "/Users/howardyoo/python3/lib/python3.9/site-packages/airflow/utils/session.py", line 71, in wrapper
    return func(*args, session=session, **kwargs)
  File "/Users/howardyoo/python3/lib/python3.9/site-packages/airflow/utils/cli.py", line 98, in wrapper
    return f(*args, **kwargs)
  File "/Users/howardyoo/python3/lib/python3.9/site-packages/airflow/cli/commands/dag_command.py", line 429, in dag_test
    dag.clear(start_date=args.execution_date, end_date=args.execution_date, dag_run_state=State.NONE)
  File "/Users/howardyoo/python3/lib/python3.9/site-packages/airflow/utils/session.py", line 71, in wrapper
    return func(*args, session=session, **kwargs)
  File "/Users/howardyoo/python3/lib/python3.9/site-packages/airflow/models/dag.py", line 1906, in clear
    clear_task_instances(
  File "/Users/howardyoo/python3/lib/python3.9/site-packages/airflow/models/taskinstance.py", line 286, in clear_task_instances
    dr.state = dag_run_state
  File "<string>", line 1, in __set__
  File "/Users/howardyoo/python3/lib/python3.9/site-packages/airflow/models/dagrun.py", line 207, in set_state
    raise ValueError(f"invalid DagRun state: {state}")
ValueError: invalid DagRun state: None
```
When going through the DAG runs in the UI, I noticed the entry that my test
run had produced.

It looks like running the DAG in `test` mode submits the DAG run with the
`backfill` run type. I am not completely sure why `airflow dags test` succeeds
only once, but it looks like some cleanup step that should clear out the
previous test run may be missing (just my theory).
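The traceback seems consistent with this theory. Below is a minimal sketch (my
simplified reading of the frames above, not the actual Airflow source) of why
the second run raises:

```python
# A simplified reading of the traceback above (not the actual Airflow source).
# On a second `airflow dags test`, dag_test() first tries to clear the DagRun
# left over from the first test:
#
#     dag.clear(start_date=..., end_date=..., dag_run_state=State.NONE)
#
# clear() then assigns that state to the existing run (dr.state = dag_run_state),
# but State.NONE is literally None, which DagRun.set_state() rejects with:
#
#     raise ValueError(f"invalid DagRun state: {state}")
from airflow.utils.state import State

print(State.NONE is None)  # True -- so the setter receives None and raises
```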
# Workaround
A viable workaround to stop it from failing is to find and delete the leftover
DAG run instance. Once the above DAG run entry is deleted, I could successfully
run my `airflow dags test` command again.
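For reference, the same cleanup can also be done programmatically instead of
through the UI. This is a minimal sketch, assuming a local dev environment; the
`dag_id` and date are hypothetical examples:

```python
# Sketch of the workaround: delete the leftover DagRun row so that
# `airflow dags test` can run again. dag_id and the date are examples only.
from airflow.models import DagRun
from airflow.utils import timezone
from airflow.utils.session import create_session

dag_id = "example_bash_operator"               # hypothetical DAG id
execution_date = timezone.parse("2022-01-21")  # date used for the first test

with create_session() as session:
    session.query(DagRun).filter(
        DagRun.dag_id == dag_id,
        DagRun.execution_date == execution_date,
    ).delete(synchronize_session=False)
```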
### What you expected to happen
According to the documentation
(https://airflow.apache.org/docs/apache-airflow/stable/tutorial.html#id2), it
is stated that:
> The same applies to airflow dags test [dag_id] [logical_date], but on a
> DAG level. It performs a single DAG run of the given DAG id. While it does
> take task dependencies into account, no state is registered in the database.
> It is convenient for locally testing a full run of your DAG, given that e.g.
> if one of your tasks expects data at some location, it is available.

It does not mention that you have to delete the DAG run instance to re-run the
test, so I would expect the `airflow dags test` command to run successfully not
only the first time but also on any consecutive run, without any errors.
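In other words, contrary to the quoted documentation, state clearly *is*
registered: after a single `airflow dags test`, a backfill-type DagRun row is
left in the metadata database. A quick way to confirm this (a sketch with a
hypothetical `dag_id`) is:

```python
# Sketch: inspect the metadata DB after one `airflow dags test` run.
# A backfill-type DagRun row is left behind, which blocks the next test.
from airflow.models import DagRun
from airflow.utils.session import create_session

with create_session() as session:
    for dr in session.query(DagRun).filter(DagRun.dag_id == "example_bash_operator"):
        print(dr.run_id, dr.run_type, dr.state)  # expect run_type == 'backfill'
```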
### How to reproduce
- Install the reported version of Airflow.
- Run Airflow in standalone mode using the `airflow standalone` command. It
should start up the basic webserver, scheduler, and triggerer.
- Pick any existing DAG and run `airflow dags test <dag_id> <start_dt>` to
initiate a DAG test.
- Once the test is finished, re-run the same command and observe the error.
- Go to the DAG runs, delete the DAG run that the first test produced, and run
the test again - the test should now run successfully.
### Operating System
MacOS Monterey (Version 12.1)
### Versions of Apache Airflow Providers
No providers were used
### Deployment
Other
### Deployment details
This Airflow instance is running in `standalone` mode on my local macOS
environment. I set up a dev environment by cloning the GitHub repository and
building Airflow to run locally. It uses SQLite as its backend database and the
SequentialExecutor to execute tasks sequentially.
### Anything else
Nothing much. I would like this issue to be resolved so that I can run my DAG
tests easily without 'actually' running the DAG or relying on the UI. Also,
there seems to be little information on what this `test` mode means and how it
differs from normal runs, so improving the documentation to clarify it would be
nice.
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)