howardyoo opened a new issue #21023:
URL: https://github.com/apache/airflow/issues/21023


   ### Apache Airflow version
   
   main (development)
   
   ### What happened
   
   # Product and Version
   Airflow Version: v2.3.0.dev0 (Git Version: .release:2.3.0.dev0+7a9ab1d7170567b1d53938b2f7345dae2026c6ea), installed to test and learn its functionality. I installed it by cloning the repository with git and building Airflow in my macOS environment with Python 3.9.
   
   # Problem Statement
   While testing my DAG, I wanted to run `airflow dags test <dag_id> <execution_dt>` so that I don't have to trigger dag runs through the UI each time. Running the command and inspecting its output is more productive for rapid iteration on a DAG.
   
   The test runs perfectly the first time, but when I try to re-run it, the following error message is observed:
   ```
   [2022-01-21 10:30:33,530] {migration.py:201} INFO - Context impl SQLiteImpl.
   [2022-01-21 10:30:33,530] {migration.py:204} INFO - Will assume non-transactional DDL.
   [2022-01-21 10:30:33,568] {dagbag.py:498} INFO - Filling up the DagBag from /Users/howardyoo/airflow/dags
   [2022-01-21 10:30:33,588] {example_python_operator.py:67} WARNING - The virtalenv_python example task requires virtualenv, please install it.
   [2022-01-21 10:30:33,594] {tutorial_taskflow_api_etl_virtualenv.py:29} WARNING - The tutorial_taskflow_api_etl_virtualenv example DAG requires virtualenv, please install it.
   Traceback (most recent call last):
     File "/Users/howardyoo/python3/bin/airflow", line 33, in <module>
       sys.exit(load_entry_point('apache-airflow==2.3.0.dev0', 'console_scripts', 'airflow')())
     File "/Users/howardyoo/python3/lib/python3.9/site-packages/airflow/__main__.py", line 48, in main
       args.func(args)
     File "/Users/howardyoo/python3/lib/python3.9/site-packages/airflow/cli/cli_parser.py", line 50, in command
       return func(*args, **kwargs)
     File "/Users/howardyoo/python3/lib/python3.9/site-packages/airflow/utils/session.py", line 71, in wrapper
       return func(*args, session=session, **kwargs)
     File "/Users/howardyoo/python3/lib/python3.9/site-packages/airflow/utils/cli.py", line 98, in wrapper
       return f(*args, **kwargs)
     File "/Users/howardyoo/python3/lib/python3.9/site-packages/airflow/cli/commands/dag_command.py", line 429, in dag_test
       dag.clear(start_date=args.execution_date, end_date=args.execution_date, dag_run_state=State.NONE)
     File "/Users/howardyoo/python3/lib/python3.9/site-packages/airflow/utils/session.py", line 71, in wrapper
       return func(*args, session=session, **kwargs)
     File "/Users/howardyoo/python3/lib/python3.9/site-packages/airflow/models/dag.py", line 1906, in clear
       clear_task_instances(
     File "/Users/howardyoo/python3/lib/python3.9/site-packages/airflow/models/taskinstance.py", line 286, in clear_task_instances
       dr.state = dag_run_state
     File "<string>", line 1, in __set__
     File "/Users/howardyoo/python3/lib/python3.9/site-packages/airflow/models/dagrun.py", line 207, in set_state
       raise ValueError(f"invalid DagRun state: {state}")
   ValueError: invalid DagRun state: None
   ```
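   From the last few frames, the failure appears to come from `dag_test` calling `dag.clear(..., dag_run_state=State.NONE)` while the existing dag run's state setter rejects `None`. The snippet below is only a simplified illustration of that interaction, not the actual Airflow source; the class and names are stand-ins mirroring the traceback.
   ```python
   # Simplified illustration of the failure path seen in the traceback above.
   # This is NOT the real Airflow code; it only mimics the validation that
   # dagrun.py's set_state appears to perform.
   from enum import Enum


   class DagRunState(str, Enum):
       QUEUED = "queued"
       RUNNING = "running"
       SUCCESS = "success"
       FAILED = "failed"


   class FakeDagRun:
       """Stand-in for airflow.models.dagrun.DagRun (illustrative only)."""

       def set_state(self, state):
           # The second `airflow dags test` run reaches this check with
           # state=None (State.NONE is literally None), so it raises.
           if state not in list(DagRunState):
               raise ValueError(f"invalid DagRun state: {state}")
           self._state = state


   FakeDagRun().set_state(None)  # ValueError: invalid DagRun state: None
   ```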
   When going through the DAG runs in the UI, I noticed the following entry for my dag test run.
   ![Screen Shot 2022-01-21 at 10 31 52 AM](https://user-images.githubusercontent.com/32691630/150564356-f8b95b11-794a-451e-b5ad-ab9b59f3b52b.png)
   It looks like running the dag in `test` mode submits the dag run with the `backfill` run type. I am not completely sure why `airflow dags test` succeeds only once, but my theory is that some cleanup step that should clear out the previous test run is missing.
   
   # Workaround
   A viable workaround is to find and delete the stale dag run instance. Once the dag run entry shown above is deleted, I can successfully run my `airflow dags test` command again.
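   For reference, the same cleanup can be done programmatically against the metadata database instead of through the UI. This is only a rough sketch assuming a default local setup; `DAG_ID` and `EXECUTION_DATE` are placeholders for your own values, not anything taken from the report above.
   ```python
   # Hedged sketch: delete the leftover dag run created by `airflow dags test`
   # directly from the metadata DB, equivalent to deleting it in the UI.
   from airflow.models import DagRun
   from airflow.utils import timezone
   from airflow.utils.session import create_session

   DAG_ID = "example_bash_operator"                 # placeholder dag_id
   EXECUTION_DATE = timezone.datetime(2022, 1, 21)  # placeholder execution date

   with create_session() as session:
       # Remove the run so the next `airflow dags test` can create it again.
       session.query(DagRun).filter(
           DagRun.dag_id == DAG_ID,
           DagRun.execution_date == EXECUTION_DATE,
       ).delete(synchronize_session=False)
   ```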
   
   
   ### What you expected to happen
   
   The documentation (https://airflow.apache.org/docs/apache-airflow/stable/tutorial.html#id2) states:
   
   > The same applies to airflow dags test [dag_id] [logical_date], but on a DAG level. It performs a single DAG run of the given DAG id. While it does take task dependencies into account, no state is registered in the database. It is convenient for locally testing a full run of your DAG, given that e.g. if one of your tasks expects data at some location, it is available.
   
   It does not mention that you have to delete the dag run instance to re-run the test, so I would expect the `airflow dags test` command to succeed not only on the first run but also on any consecutive runs, without errors.
   
   ### How to reproduce
   
   - Get the reported version of Airflow and install it.
   - Run Airflow in standalone mode with the `airflow standalone` command. It should start up the basic webserver, scheduler, and triggerer.
   - Pick any DAG that exists in the DAGs list and run `airflow dags test <dag_id> <start_dt>` to initiate the DAG test.
   - Once the test has finished, re-run the same command and observe the error.
   - Go to the DAG runs in the UI, delete the dag run that the first test produced, and run the test again - it should now succeed.
   
   ### Operating System
   
   macOS Monterey (Version 12.1)
   
   ### Versions of Apache Airflow Providers
   
   No providers were used
   
   ### Deployment
   
   Other
   
   ### Deployment details
   
   This Airflow instance runs in `standalone` mode in my local macOS environment. I set up a dev environment by cloning the GitHub repository and building Airflow to run locally. It uses SQLite as its backend database and the SequentialExecutor to execute tasks sequentially.
   
   ### Anything else
   
   Nothing much. I would like this issue to be resolved so that I can run my DAG tests easily without 'actually' running the DAG or relying on the UI. Also, there seems to be little information on what this `test` mode means and how it differs from normal runs, so improving the documentation to clarify that would be nice.
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

