grassten opened a new issue, #24297:
URL: https://github.com/apache/airflow/issues/24297

   ### Apache Airflow version
   
   2.3.2 (latest released)
   
   ### What happened
   
   When running the `airflow tasks clear` command, we get the following error:
   
   ```
   [2022-06-07 15:59:58,353] {{dagbag.py:507}} INFO - Filling up the DagBag from /usr/local/airflow
   Traceback (most recent call last):
     File "/usr/local/bin/airflow", line 8, in <module>
       sys.exit(main())
     File "/usr/local/lib/python3.8/dist-packages/airflow/__main__.py", line 38, in main
       args.func(args)
     File "/usr/local/lib/python3.8/dist-packages/airflow/cli/cli_parser.py", line 51, in command
       return func(*args, **kwargs)
     File "/usr/local/lib/python3.8/dist-packages/airflow/utils/cli.py", line 99, in wrapper
       return f(*args, **kwargs)
     File "/usr/local/lib/python3.8/dist-packages/airflow/cli/commands/task_command.py", line 591, in task_clear
       dags = get_dags(args.subdir, args.dag_id, use_regex=args.dag_regex)
     File "/usr/local/lib/python3.8/dist-packages/airflow/utils/cli.py", line 214, in get_dags
       return [get_dag(subdir, dag_id)]
     File "/usr/local/lib/python3.8/dist-packages/airflow/utils/cli.py", line 201, in get_dag
       dagbag = DagBag(process_subdir(subdir))
     File "/usr/local/lib/python3.8/dist-packages/airflow/models/dagbag.py", line 130, in __init__
       self.collect_dags(
     File "/usr/local/lib/python3.8/dist-packages/airflow/models/dagbag.py", line 514, in collect_dags
       for filepath in list_py_file_paths(
     File "/usr/local/lib/python3.8/dist-packages/airflow/utils/file.py", line 305, in list_py_file_paths
       file_paths.extend(find_dag_file_paths(directory, safe_mode))
     File "/usr/local/lib/python3.8/dist-packages/airflow/utils/file.py", line 323, in find_dag_file_paths
       for file_path in find_path_from_directory(str(directory), ".airflowignore"):
     File "/usr/local/lib/python3.8/dist-packages/airflow/utils/file.py", line 242, in _find_path_from_directory
       raise RuntimeError(
   RuntimeError: Detected recursive loop when walking DAG directory /usr/local/airflow: /usr/local/airflow/logs/splunk/scheduler/2022-06-07 has appeared more than once.
   ```
   
   The `/usr/local/airflow/logs/splunk/scheduler` directory contains two entries: `2022-06-07` and `latest`, which is a symlink to `2022-06-07`.
   
   The error is raised here: https://github.com/apache/airflow/blob/0bf5f495d4131109fba449697adee68a62516851/airflow/utils/file.py#L242
   
   Our `airflow.cfg` sets `child_process_log_directory = /usr/local/airflow/logs/splunk/scheduler`.
   
   ### What you think should happen instead
   
   The `airflow tasks clear` command should run successfully.
   
   ### How to reproduce
   
   My understanding is that if the scheduler logging directory sits under the directory Airflow walks for DAGs and contains both `2022-06-07` and `latest` (a symlink to `2022-06-07`), any attempt to clear a task through the CLI fails.
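
The layout described above can be recreated in a temporary directory to show why the walk trips. This is a minimal sketch, not Airflow's actual code: `build_layout`, `walk_detecting_duplicates`, and `reproduce` are hypothetical names, and the check is only assumed to resemble what `_find_path_from_directory` does (follow symlinks, then complain when a resolved directory shows up twice).

```python
import os
import tempfile
from pathlib import Path


def build_layout(root: Path) -> None:
    """Create <root>/logs/splunk/scheduler with a dated dir and a `latest` symlink."""
    scheduler_logs = root / "logs" / "splunk" / "scheduler"
    (scheduler_logs / "2022-06-07").mkdir(parents=True)
    # Relative symlink: latest -> 2022-06-07
    (scheduler_logs / "latest").symlink_to("2022-06-07", target_is_directory=True)


def walk_detecting_duplicates(base_dir: Path) -> list:
    """Walk base_dir following symlinks; raise if a resolved directory repeats."""
    seen = set()
    visited = []
    for current, dirs, _files in os.walk(base_dir, followlinks=True):
        for sub in dirs:
            # resolve() collapses the symlink onto its real target
            resolved = (Path(current) / sub).resolve()
            if resolved in seen:
                raise RuntimeError(
                    f"Detected recursive loop when walking {base_dir}: "
                    f"{resolved} has appeared more than once."
                )
            seen.add(resolved)
        visited.append(Path(current))
    return visited


def reproduce() -> str:
    """Return the error message raised for the symlinked layout, or '' if none."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        build_layout(root)
        try:
            walk_detecting_duplicates(root)
        except RuntimeError as exc:
            return str(exc)
    return ""


if __name__ == "__main__":
    print(reproduce() or "no loop detected")
```

Note that nothing here is truly recursive: `latest` and `2022-06-07` are siblings that resolve to the same real path, which is enough for a duplicate check of this kind to fire.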
   
   ### Operating System
   
   Linux
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Docker-Compose
   
   ### Deployment details
   
   _No response_
   
   ### Anything else
   
   As a workaround, adding `logs/splunk/scheduler/latest` to our `.airflowignore` file resolved the issue for us.
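
A rough illustration of why the workaround helps: if ignored subdirectories are pruned before the duplicate check runs, the `latest` symlink is never compared against its target. This sketch assumes ignore entries are regular expressions searched against the path relative to the walked directory; Airflow's real matching rules may differ, and `is_ignored`, `walk_with_ignore`, and `demo` are hypothetical names.

```python
import os
import re
import tempfile
from pathlib import Path


def is_ignored(rel_path: str, patterns: list) -> bool:
    """True if any compiled pattern matches somewhere in the relative path."""
    return any(rx.search(rel_path) for rx in patterns)


def walk_with_ignore(base_dir: Path, ignore_patterns: list) -> list:
    """Walk base_dir following symlinks, pruning ignored dirs before the duplicate check."""
    compiled = [re.compile(p) for p in ignore_patterns]
    seen = set()
    visited = []
    for current, dirs, _files in os.walk(base_dir, followlinks=True):
        rel_current = Path(current).relative_to(base_dir)
        # Prune in place so os.walk never descends into ignored directories.
        dirs[:] = [
            d for d in dirs
            if not is_ignored((rel_current / d).as_posix(), compiled)
        ]
        for sub in dirs:
            resolved = (Path(current) / sub).resolve()
            if resolved in seen:
                raise RuntimeError(f"{resolved} has appeared more than once.")
            seen.add(resolved)
        visited.append(rel_current.as_posix())
    return visited


def demo() -> list:
    """Build the dated-dir-plus-symlink layout and walk it with the workaround entry."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        scheduler_logs = root / "logs" / "splunk" / "scheduler"
        (scheduler_logs / "2022-06-07").mkdir(parents=True)
        (scheduler_logs / "latest").symlink_to("2022-06-07", target_is_directory=True)
        return walk_with_ignore(root, ["logs/splunk/scheduler/latest"])


if __name__ == "__main__":
    for path in demo():
        print(path)
```

With the ignore entry in place the walk completes and only the dated directory is visited; passing an empty pattern list to `walk_with_ignore` brings the `RuntimeError` back.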
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


