krithick-j opened a new issue, #49526:
URL: https://github.com/apache/airflow/issues/49526
### Apache Airflow version
2.10.5
### If "Other Airflow 2 version" selected, which one?
_No response_
### What happened?
When triggering a DAG run through the Airflow REST API (e.g., via a script
calling the `/dags/{dag_id}/dagRuns` endpoint), the task instances created in the
database (`task_instance` table) are inserted with a `state` of NULL instead of a
runnable state such as `queued` or `scheduled`.
This prevents the Airflow scheduler from ever identifying these tasks as
ready for execution (`No tasks to consider for execution.` appears in the scheduler
logs), leaving the DAG run stuck indefinitely in the `queued` state in the UI.
Analysis of the PostgreSQL logs confirms that the `INSERT INTO task_instance` query
generated by Airflow omits the `state` and `queued_dttm` columns from the list
of columns being inserted.
**Version Info**
- Airflow version: 2.10.5
- Installation: pip in a Python virtual environment; scheduler and webserver run as systemd services
- Executor: LocalExecutor
- Database: PostgreSQL 16, running on localhost
- Python version: 3.11 (inferred from the venv path; not independently confirmed)
- OS: Ubuntu 24.04.2 LTS
### What you think should happen instead?
_No response_
### How to reproduce
These steps assume a clean or representative Airflow 2.10.5 installation
with PostgreSQL.
Reproducible Steps
Set up Airflow:
Install Apache Airflow v2.10.5 in a Python virtual environment with
PostgreSQL as the database backend and LocalExecutor.
Ensure the database is initialized (airflow db init).
2. **Configure Airflow:**
   - Update `airflow.cfg` (or the matching environment variables) so that
     `sql_alchemy_conn` points to your PostgreSQL database.
   - Set `executor = LocalExecutor`.
   - Set the logging level to DEBUG: `[logging] logging_level = DEBUG`
     (in Airflow 2.x this option lives in the `[logging]` section, not `[core]`).
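   The settings in this step can be collected into one illustrative
   `airflow.cfg` excerpt (the connection string is a placeholder for your
   actual database credentials):

   ```ini
   ; airflow.cfg (illustrative excerpt; connection string is a placeholder)
   [core]
   executor = LocalExecutor

   [database]
   sql_alchemy_conn = postgresql+psycopg2://airflow:airflow@localhost:5432/airflow

   [logging]
   logging_level = DEBUG
   ```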
3. **Configure PostgreSQL verbose logging:**
   - Edit your `postgresql.conf` file (`/etc/postgresql/16/main/postgresql.conf`
     or similar).
   - Set `log_connections = on` and `log_disconnections = on`.
   - Set `log_min_duration_statement = 0` (or `log_statement = 'all'`).
   - Save the file and reload or restart the PostgreSQL service
     (`sudo systemctl reload postgresql` or `sudo systemctl restart postgresql`).
4. **Add a test DAG:**
   - Place a simple DAG file designed for API triggers into your `dags_folder`.
     The DAG should have no schedule (`schedule=None`, or `schedule_interval=None`
     on older versions) and define at least one task. (A simplified version of
     `generic_etl_dag.py` works.)
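   A minimal DAG of the shape described in this step might look like the
   following sketch (the DAG id, task, and `params` are illustrative
   placeholders, not the actual `generic_etl_dag.py`):

   ```python
   # dags/api_trigger_test.py -- minimal DAG for API-trigger testing (illustrative)
   from datetime import datetime

   from airflow import DAG
   from airflow.operators.python import PythonOperator


   def echo_conf(**context):
       # Print the conf passed in the API request body.
       print("conf:", context["dag_run"].conf)


   with DAG(
       dag_id="api_trigger_test",
       schedule=None,              # only runs when triggered (API or UI)
       start_date=datetime(2024, 1, 1),
       catchup=False,
       params={"user_id": "", "connector_id": ""},
   ) as dag:
       PythonOperator(task_id="echo_conf", python_callable=echo_conf)
   ```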
5. **Start Airflow components:**
   - Start the scheduler: `airflow scheduler` (or via the systemd service,
     `sudo systemctl start airflow-scheduler`).
   - Start the webserver: `airflow webserver` (or via the systemd service,
     `sudo systemctl start airflow-webserver`).
6. **Trigger the DAG via the API:**
   - Use a tool like `curl` or a Python script (like `main.py`) to trigger the
     DAG using the REST API endpoint `/api/v1/dags/{dag_id}/dagRuns`, passing
     any `params` the DAG expects in the request body.
   - Example `curl` command (requires an API user/password set up):

   ```bash
   curl -X POST 'http://localhost:8080/api/v1/dags/YOUR_DAG_ID/dagRuns' \
     -H 'Content-Type: application/json' \
     --user "your_api_user:your_api_password" \
     -d '{"conf": { "user_id": "test_user", "connector_id": "test_connector" }}'
   ```

   Replace `YOUR_DAG_ID`, `your_api_user`, `your_api_password`, and the `conf`
   content as necessary for your DAG.
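   For reference, the same call can be sketched in Python using only the
   standard library (the URL, credentials, and `conf` values are placeholders
   for your deployment):

   ```python
   # Build and (optionally) send the same dagRuns POST request as the curl example.
   import base64
   import json
   import urllib.request


   def build_trigger_request(base_url: str, dag_id: str, conf: dict,
                             user: str, password: str) -> urllib.request.Request:
       """Build a POST request for the /api/v1/dags/{dag_id}/dagRuns endpoint."""
       url = f"{base_url}/api/v1/dags/{dag_id}/dagRuns"
       token = base64.b64encode(f"{user}:{password}".encode()).decode()
       return urllib.request.Request(
           url,
           data=json.dumps({"conf": conf}).encode(),
           headers={
               "Content-Type": "application/json",
               "Authorization": f"Basic {token}",
           },
           method="POST",
       )


   req = build_trigger_request(
       "http://localhost:8080", "YOUR_DAG_ID",
       {"user_id": "test_user", "connector_id": "test_connector"},
       "your_api_user", "your_api_password",
   )
   # urllib.request.urlopen(req)  # uncomment to actually send the request
   ```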
7. **Observe the behavior:**
   - Immediately after triggering, check the Airflow UI: the new DAG run
     appears and stays in the `queued` state.
   - Watch the scheduler logs (`journalctl -u airflow-scheduler.service -f`)
     for `DEBUG - No tasks to consider for execution.` messages.
   - Watch the PostgreSQL logs (`sudo tail -f /path/to/postgresql.log`) for
     the `INSERT INTO task_instance` statement generated for the new run ID.
   - Use `psql` to query the `task_instance` table for the new run ID and
     check the `state` column:
     `SELECT task_id, state, run_id FROM task_instance WHERE run_id = 'your_new_run_id';`
**Expected Outcome of Reproduction (The Bug):**
- The DAG run remains stuck in the `queued` state.
- Scheduler logs show `No tasks to consider for execution.`
- PostgreSQL logs show the `INSERT INTO task_instance` statement is missing
  the `state` and `queued_dttm` columns.
- A direct database query shows the task instances have `state = NULL`.
### Operating System
Distributor ID: Ubuntu
Description: Ubuntu 24.04.2 LTS
Release: 24.04
Codename: noble
### Versions of Apache Airflow Providers
apache-airflow-providers-amazon 9.4.0
apache-airflow-providers-common-compat 1.5.1
apache-airflow-providers-common-io 1.2.0
apache-airflow-providers-common-sql 1.24.0
apache-airflow-providers-fab 1.5.3
apache-airflow-providers-ftp 3.7.0
apache-airflow-providers-http 4.8.0
apache-airflow-providers-imap 3.5.0
apache-airflow-providers-postgres 5.10.0
apache-airflow-providers-smtp 2.0.1
apache-airflow-providers-sqlite 3.7.0
### Deployment
Virtualenv installation
### Deployment details
_No response_
### Anything else?
_No response_
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [x] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]