GitHub user MrHenryD edited a discussion: DAGProcessor failing to process dag 
files after an unknown period of time

### Apache Airflow version
3.0.2 (Python 3.11)

### Deployment
* AWS EKS
* Kubernetes (via Helm Chart v1.17.0)

### Deployment details
**airflow.cfg**
```
[core]
dagbag_import_timeout = 360
executor = KubernetesExecutor

[dag_processor]
bundle_refresh_check_interval = 240
dag_file_processor_timeout = 1800
disable_bundle_versioning = true
min_file_process_interval = 600
parsing_processes = 3
print_stats_interval = 300
refresh_interval = 600
stale_bundle_cleanup_interval = 3600
stale_bundle_cleanup_min_versions = 1
stale_dag_threshold = 1800

[scheduler]
dag_stale_not_seen_duration = 3600
parsing_cleanup_interval = 600
standalone_dag_processor = True
```
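Since several of these settings are time limits or intervals in seconds, it can help to list them side by side in ascending order. The following is a minimal sketch using only the Python standard library; `load_timings` and `summarize` are hypothetical helper names (not Airflow APIs), and the fragment is copied from the config above with only the numeric timing keys kept.

```python
import configparser

# Timing-related keys copied from the airflow.cfg shown above.
CFG = """
[core]
dagbag_import_timeout = 360

[dag_processor]
dag_file_processor_timeout = 1800
min_file_process_interval = 600
refresh_interval = 600
stale_dag_threshold = 1800

[scheduler]
dag_stale_not_seen_duration = 3600
parsing_cleanup_interval = 600
"""


def load_timings(text):
    """Return {(section, key): seconds} for every integer-valued option."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    timings = {}
    for section in parser.sections():
        for key, value in parser.items(section):
            if value.isdigit():
                timings[(section, key)] = int(value)
    return timings


def summarize(timings):
    """Format the options as text lines, smallest value first."""
    ordered = sorted(timings.items(), key=lambda item: item[1])
    return [
        f"{seconds:>5}s  {seconds / 60:>5.1f}min  [{section}] {key}"
        for (section, key), seconds in ordered
    ]


if __name__ == "__main__":
    print("\n".join(summarize(load_timings(CFG))))
```

This only reorders the values already shown; it does not say anything about how Airflow combines them internally.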

### What happened
The Airflow DAG processor works fine initially, but after some period of time it 
fails to parse some DAG files, which causes those DAGs to be disabled and no 
longer scheduled. I don't see any logs that explain why the error occurs.

<img width="620" height="118" alt="image" 
src="https://github.com/user-attachments/assets/eb313a56-1dc8-4d36-af8e-c4b242efaf95"
 />

The errors are associated with a particular DAG file path, but I don't know 
where to look for the underlying cause (nothing shows up in the logs).

In addition, the time it takes to process these files seems unstable. 
Sometimes a DAG is processed within 10s and other times it takes 40s, which is 
odd because the DAG processor has been given far more resources than it is 
currently using.
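One way to check whether the variance comes from the DAG file's own top-level code is to time its import repeatedly outside the DAG processor. The sketch below is an assumption-laden probe, not how Airflow itself measures parse time: `time_import` is a hypothetical helper, it uses only the standard library, and it must be run in an environment where the file's imports resolve (for example, inside the dag-processor pod).

```python
import importlib.util
import sys
import time


def time_import(path, runs=5):
    """Execute the Python file at `path` `runs` times.

    Returns the wall-clock duration of each run in seconds.
    """
    durations = []
    for i in range(runs):
        # Use a fresh module name per run so the file itself is re-executed.
        name = f"_dag_timing_probe_{i}"
        spec = importlib.util.spec_from_file_location(name, path)
        module = importlib.util.module_from_spec(spec)
        start = time.perf_counter()
        spec.loader.exec_module(module)
        durations.append(time.perf_counter() - start)
        sys.modules.pop(name, None)
    return durations


def describe(durations):
    """Summarize the spread between the fastest and slowest run."""
    return f"min={min(durations):.2f}s max={max(durations):.2f}s runs={len(durations)}"
```

For example, `describe(time_import("/path/to/some_dag.py"))` (the path is a placeholder) gives the fastest and slowest run. Libraries imported by the file stay cached in `sys.modules` after the first run, so the first run is expected to be slower; a large spread across the later runs would point at the file's top-level code (network calls, database queries, heavy computation) rather than at the processor.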

I've tried increasing the resources (CPU / RAM) as well as changing the Airflow 
configuration so that DAG parsing occurs less frequently and has a larger 
timeout.

**CPU usage is very low for the DAG processor (it was allocated 4 CPUs)**
<img width="1090" height="230" alt="image" 
src="https://github.com/user-attachments/assets/986a1cd2-f44c-44e5-b98c-a0383996217d"
 />

**Memory usage is also very low (it was allocated 4 GB of RAM)**
<img width="1086" height="227" alt="image" 
src="https://github.com/user-attachments/assets/7e869548-c446-45fa-8bc2-32b0c7d6c512"
 />


### How to reproduce
I don't know how to reproduce this. Some DAGs fail to get parsed at seemingly random times.

### Anything else
No response


GitHub link: https://github.com/apache/airflow/discussions/54274
