hkc-8010 opened a new issue, #67525:
URL: https://github.com/apache/airflow/issues/67525
### Under which category would you file this issue?
Airflow Core
### Apache Airflow version
3.2.1
### What happened and how to reproduce it?
We found a count mismatch in the Airflow 3 UI where the Dashboard "Dag
Import Errors" badge can show a higher number than the import-errors modal and
CLI.
In the affected deployment, the home-page badge showed `4` import errors,
but:
- the import-errors modal showed only `2` files
- `airflow dags list-import-errors` showed only `2` files
- the metadata DB contained only `2` `import_error` rows
This appears to happen when the file behind one `ParseImportError` row is
associated with multiple rows in the `dag` table, so the join in the
`/api/v2/importErrors` query expands one import error into several joined rows.
The endpoint later groups those rows back into one import-error object per
file, but `total_entries` appears to be counted before that grouping step.
The original customer report included a UI screenshot showing the count
mismatch. The screenshot is private support data, so I am not linking the
attachment here, but it can be attached manually if needed.
Evidence gathered during verification:
1. Internal verification on `2026-05-13T16:43:30Z` confirmed there were only
`2` rows in `import_error`.
2. Live verification on `2026-05-26` again showed only `2` import-error rows:
```text
(596, 2026-05-26 04:42:26.495552+00:00, 'dags/test_smtp_local.py', 'main')
(697, 2026-05-26 04:40:27.897028+00:00, 'dags/dwh_garantias_extraction.py', 'main')
```
3. Live `airflow dags list-import-errors` output on `2026-05-26` returned
only these `2` files:
```text
main | dags/dwh_garantias_extraction.py | TypeError: partial() got an unexpected keyword argument 'file_format'
main | dags/test_smtp_local.py | airflow.sdk.exceptions.AirflowRuntimeError: VARIABLE_NOT_FOUND: {'message': 'Variable AIRFLOW_CONN_SMTP_CONN not found'}
```
4. Live metadata query on `2026-05-26` showed that one import-error file
maps to multiple DAGs:
```text
## dag_counts
('dags/dwh_garantias_extraction.py', 'main', 1, 'dwh_garantias_extraction')
('dags/test_smtp_local.py', 'main', 3, 'smtp_check_emailoperator, smtp_send_smtplib, test_smtp_local')
## import_error_rows
(596, 'dags/test_smtp_local.py', 'main', 2026-05-26 04:42:26.495552+00:00)
(697, 'dags/dwh_garantias_extraction.py', 'main', 2026-05-26 04:40:27.897028+00:00)
## joined_rows
(596, 'dags/test_smtp_local.py', 'main', 'smtp_check_emailoperator')
(596, 'dags/test_smtp_local.py', 'main', 'smtp_send_smtplib')
(596, 'dags/test_smtp_local.py', 'main', 'test_smtp_local')
(697, 'dags/dwh_garantias_extraction.py', 'main', 'dwh_garantias_extraction')
```
5. A direct aggregate over that join produced:
```text
(4, 2)
```
Where:
- `4` = raw joined row count
- `2` = distinct `import_error.id` count
That matches the user-visible mismatch exactly.
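The aggregate above can be reproduced outside Airflow. The following is a minimal sketch using an in-memory SQLite database; the two tables are simplified stand-ins (only the columns needed for the join), and the join condition on `relative_fileloc` and `bundle_name` is an assumption about how the file-to-DAG association is made, not a copy of the real query.

```python
import sqlite3

# Simplified stand-ins for the metadata tables (assumed columns only).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE import_error (id INTEGER PRIMARY KEY, filename TEXT, bundle_name TEXT);
    CREATE TABLE dag (dag_id TEXT PRIMARY KEY, relative_fileloc TEXT, bundle_name TEXT);
    """
)

# Same shape as the evidence above: 2 import errors, 4 DAG rows.
conn.executemany(
    "INSERT INTO import_error VALUES (?, ?, ?)",
    [
        (596, "dags/test_smtp_local.py", "main"),
        (697, "dags/dwh_garantias_extraction.py", "main"),
    ],
)
conn.executemany(
    "INSERT INTO dag VALUES (?, ?, ?)",
    [
        ("smtp_check_emailoperator", "dags/test_smtp_local.py", "main"),
        ("smtp_send_smtplib", "dags/test_smtp_local.py", "main"),
        ("test_smtp_local", "dags/test_smtp_local.py", "main"),
        ("dwh_garantias_extraction", "dags/dwh_garantias_extraction.py", "main"),
    ],
)

# Raw joined row count versus distinct import-error count.
raw_count, distinct_count = conn.execute(
    """
    SELECT COUNT(*), COUNT(DISTINCT ie.id)
    FROM import_error AS ie
    JOIN dag AS d
      ON d.relative_fileloc = ie.filename
     AND d.bundle_name = ie.bundle_name
    """
).fetchone()

print(raw_count, distinct_count)  # 4 2
```

The file with three DAGs contributes three joined rows, which is where the extra `2` in the badge count comes from.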
Relevant code paths:
- `airflow/api_fastapi/core_api/routes/public/import_error.py`
  - builds the joined query around `select(ParseImportError, file_dags_cte.c.dag_id)`
  - groups the result later with `groupby(...)`
- `airflow/api_fastapi/common/db/common.py`
  - `paginated_select()` computes `total_entries = get_query_count(statement, session=session)` before any route-local grouping
- `airflow/ui/src/pages/Dashboard/Stats/DAGImportErrors.tsx`
  - renders the Dashboard badge from `data?.total_entries`
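How these code paths seem to interact can be modeled in plain Python. This is only an illustration of the suspected ordering (count first, group second): `paginated_select_model` and the response dictionary are hypothetical stand-ins, not the actual Airflow functions or response schema.

```python
from itertools import groupby

# Joined rows as (import_error_id, filename, dag_id), already ordered by id,
# mirroring the "joined_rows" evidence above.
joined_rows = [
    (596, "dags/test_smtp_local.py", "smtp_check_emailoperator"),
    (596, "dags/test_smtp_local.py", "smtp_send_smtplib"),
    (596, "dags/test_smtp_local.py", "test_smtp_local"),
    (697, "dags/dwh_garantias_extraction.py", "dwh_garantias_extraction"),
]


def paginated_select_model(rows):
    """Hypothetical model of a helper that counts rows before any grouping."""
    total_entries = len(rows)  # counts raw joined rows
    return rows, total_entries


rows, total_entries = paginated_select_model(joined_rows)

# Route-local grouping: collapse joined rows into one object per import error.
import_errors = []
for (error_id, filename), group in groupby(rows, key=lambda r: (r[0], r[1])):
    import_errors.append(
        {
            "import_error_id": error_id,
            "filename": filename,
            "dag_ids": [row[2] for row in group],
        }
    )

response = {"import_errors": import_errors, "total_entries": total_entries}
print(response["total_entries"], len(response["import_errors"]))  # 4 2
```

A client that reads `total_entries` sees `4`, while a client that counts the returned objects sees `2`.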
Likely reproduction shape:
1. Create or retain a file that appears once in `import_error`.
2. Ensure that same file path is associated with multiple DAG IDs in `dag`.
3. Call `/api/v2/importErrors` and observe that `total_entries` reflects raw
joined rows rather than distinct import-error objects.
4. Observe that the UI badge uses `total_entries`, while the modal list
groups back down to fewer entries.
### What you think should happen instead?
The Dashboard badge, the modal, the CLI, and the DB-backed count should all
agree on the number of import-error files.
In this case they should all show `2`.
I suspect either of the first two changes below would resolve it, and the
third would guard against regressions:
1. Make `/api/v2/importErrors` count distinct `ParseImportError.id` values
after authorization logic instead of counting raw joined rows.
2. Restructure the route so pagination and counting happen on a deduplicated
import-error subquery rather than on the raw join.
3. Add a regression test where one `ParseImportError` file maps to multiple
DAGs, but `total_entries` still matches the number of distinct import-error
objects returned.
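As a rough sketch of the second and third suggestions, the example below counts and paginates over a deduplicated selection, then asserts the regression condition. It uses SQLite with simplified tables; `make_db` and `list_import_errors` are hypothetical helpers, the `EXISTS` filter is just one possible way to avoid row multiplication, and authorization filtering is left out entirely.

```python
import sqlite3


def make_db():
    """Build simplified stand-in tables: one file maps to three DAGs."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE import_error (id INTEGER PRIMARY KEY, filename TEXT, bundle_name TEXT);
        CREATE TABLE dag (dag_id TEXT PRIMARY KEY, relative_fileloc TEXT, bundle_name TEXT);
        """
    )
    conn.executemany(
        "INSERT INTO import_error VALUES (?, ?, ?)",
        [
            (596, "dags/test_smtp_local.py", "main"),
            (697, "dags/dwh_garantias_extraction.py", "main"),
        ],
    )
    conn.executemany(
        "INSERT INTO dag VALUES (?, ?, ?)",
        [
            ("smtp_check_emailoperator", "dags/test_smtp_local.py", "main"),
            ("smtp_send_smtplib", "dags/test_smtp_local.py", "main"),
            ("test_smtp_local", "dags/test_smtp_local.py", "main"),
            ("dwh_garantias_extraction", "dags/dwh_garantias_extraction.py", "main"),
        ],
    )
    return conn


# One row per import error: EXISTS tests for a matching DAG without joining.
DEDUPED = """
    SELECT ie.id, ie.filename
    FROM import_error AS ie
    WHERE EXISTS (
        SELECT 1 FROM dag AS d
        WHERE d.relative_fileloc = ie.filename
          AND d.bundle_name = ie.bundle_name
    )
"""


def list_import_errors(conn, limit=50, offset=0):
    """Count and paginate over the same deduplicated selection."""
    total_entries = conn.execute(
        f"SELECT COUNT(*) FROM ({DEDUPED}) AS deduped"
    ).fetchone()[0]
    page = conn.execute(
        f"{DEDUPED} ORDER BY ie.id LIMIT ? OFFSET ?", (limit, offset)
    ).fetchall()
    return total_entries, page


# Regression-style check: total_entries equals the number of distinct objects.
total_entries, page = list_import_errors(make_db())
assert total_entries == len(page) == 2
```

Because the count and the page derive from the same deduplicated selection, `total_entries` stays at `2` even when a smaller `limit` truncates the page.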
### Operating System
Not Applicable - managed Astronomer deployment
### Deployment
Astronomer
### Apache Airflow Provider(s)
Not Applicable
### Versions of Apache Airflow Providers
Not Applicable
### Official Helm Chart version
Not Applicable
### Kubernetes Version
Not Applicable
### Helm Chart configuration
Not Applicable
### Docker Image customizations
Unknown / not relevant for the API counting bug
### Anything else?
I did not find an obvious existing Airflow issue or PR for this exact
count-inflation behavior when searching for:
- `import errors count modal home page`
- `import error relative_fileloc bundle_name`
- `importErrors total_entries DagModel relative_fileloc bundle_name`
### Are you willing to submit PR?
- [x] Yes I am willing to submit a PR!
### Code of Conduct
- [x] I agree to follow this project's Code of Conduct