aru-trackunit opened a new issue, #58554:
URL: https://github.com/apache/airflow/issues/58554

   ### Apache Airflow version
   
   3.1.3
   
   ### If "Other Airflow 2/3 version" selected, which one?
   
   _No response_
   
   ### What happened?
   
   Backfill task is stuck in the "No status" state.
   
   When triggering a backfill for 2025-11-17, I successfully pass the form step and the dag run is created and sits in the running state; however, the task itself never starts, and its state in the database is null.
   
   I have restarted all of the Airflow Kubernetes pods, removed the DAG's data using delete DAG, and let it be parsed again from scratch.
   
   I have checked parallelism, which is set to 40 while the maximum number of tasks running was around 30, and there are 93 open slots in the pool.
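
   For anyone who wants to double-check those occupancy numbers, here is a minimal sketch of how they can be re-checked, assuming direct read access to the metadata database (the connection URI is a placeholder, not our real DSN):
   
   ```
   # Minimal sketch: compare running/queued task counts against
   # AIRFLOW__CORE__PARALLELISM and pool capacity. Assumes direct read access
   # to the metadata database; the connection URI is a placeholder.
   from sqlalchemy import create_engine, text
   
   engine = create_engine("postgresql://airflow:***@airflow-pgbouncer:6543/airflow")
   
   with engine.connect() as conn:
       occupied = conn.execute(text(
           "SELECT state, count(*) FROM task_instance "
           "WHERE state IN ('running', 'queued') GROUP BY state"
       )).all()
       pools = conn.execute(text("SELECT pool, slots FROM slot_pool")).all()
   
   print(occupied)  # should stay below parallelism (40 here)
   print(pools)     # default_pool capacity
   ```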
   
   `dag` record:
   | dag_id | is_paused | is_stale | last_parsed_time | last_parse_duration | last_expired | fileloc | relative_fileloc | bundle_name | bundle_version | owners | dag_display_name | description | timetable_summary | timetable_description | asset_expression | deadline | max_active_tasks | max_active_runs | max_consecutive_failed_dag_runs | has_task_concurrency_limits | has_import_errors | next_dagrun | next_dagrun_data_interval_start | next_dagrun_data_interval_end | next_dagrun_create_after |
   | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
   | ble-monitoring | FALSE | FALSE | 2025-11-21 10:45:56.027054+00 | 0.14697423899997375 |  | /opt/airflow/dags/dashboards/dag_ble_monitoring.py | dashboards/dag_ble_monitoring.py | dags-folder |  | team_iot_data |  | BLE Monitoring Dashboard on PowerBi | 30 0 * * * | At 00:30 | null | null | 16 | 1 | 0 | FALSE | FALSE | 2025-11-21 00:30:00+00 | 2025-11-21 00:30:00+00 | 2025-11-22 00:30:00+00 | 2025-11-22 00:30:00+00 |
   
   
   
   `dag_run` record:
   | id | dag_id | queued_at | logical_date | start_date | end_date | state | run_id | creating_job_id | run_type | triggered_by | triggering_user_name | conf | data_interval_start | data_interval_end | run_after | last_scheduling_decision | log_template_id | updated_at | clear_number | backfill_id | bundle_version | scheduled_by_job_id | context_carrier | span_status | created_dag_version_id |
   | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
   | 8980 | ble-monitoring | 2025-11-21 10:31:50.877612+00 | 2025-11-11 00:30:00+00 | 2025-11-21 10:31:51.01211+00 | NULL | running | backfill__2025-11-12T00:30:00+00:00 | null | backfill | BACKFILL | okta_00ud5ffefmodlaFL2357 | {} | 2025-11-11 00:30:00+00 | 2025-11-12 00:30:00+00 | 2025-11-12 00:30:00+00 | null | 1 | 2025-11-21 10:31:51.012965+00 | 0 | 9 | null | null | {"__var": {}, "__type": "dict"} | not_started | 019aa5d3-ae8f-7d68-9ef4-b97916d1470b |
   
   `task_instance` record:
   | task_id | state | pool | queue | start_date | end_date | try_number |
   |-|-|-|-|-|-|-|
   | ble_monitoring | NULL | default_pool | default | NULL | NULL | 0 |
   
   `backfill` record:
   | id | dag_id | from_date | to_date | dag_run_conf | is_paused | reprocess_behavior | max_active_runs | created_at | completed_at | updated_at | triggering_user_name |
   |-|-|-|-|-|-|-|-|-|-|-|-|
   | 9 | ble-monitoring | 2025-11-11 00:30:00+00 | 2025-11-11 00:30:00+00 | {} | FALSE | completed | 1 | 2025-11-21 10:31:50.861649+00 | NULL | 2025-11-21 10:31:50.861653+00 | okta_00ud5ffefmodlaFL2357 |
   
   `backfill_dag_run` record:
   | id | backfill_id | dag_run_id | exception_reason | logical_date | sort_ordinal |
   |-|-|-|-|-|-|
   | 9 | 9 | 8980 | NULL | 2025-11-11 00:30:00+00 | 1 |
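
   All of the records above were pulled straight from the metadata database; here is a sketch of a join that surfaces the stuck task instance(s) for a given backfill (table and column names exactly as they appear above, placeholder URI):
   
   ```
   # Sketch: list the task instances behind a backfill's dag runs to spot rows
   # stuck with a NULL state. Table/column names match the records above;
   # the connection URI is a placeholder.
   from sqlalchemy import create_engine, text
   
   engine = create_engine("postgresql://airflow:***@airflow-pgbouncer:6543/airflow")
   
   query = text("""
       SELECT dr.run_id, dr.state AS dag_run_state,
              ti.task_id, ti.state AS ti_state, ti.try_number
       FROM backfill_dag_run bdr
       JOIN dag_run dr ON dr.id = bdr.dag_run_id
       JOIN task_instance ti ON ti.dag_id = dr.dag_id AND ti.run_id = dr.run_id
       WHERE bdr.backfill_id = :backfill_id
   """)
   
   with engine.connect() as conn:
       for row in conn.execute(query, {"backfill_id": 9}):
           print(row)  # the stuck task shows ti_state = None, try_number = 0
   ```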
   
   I wonder if I am the first one to see this. Do you think it might be the case that the database state is corrupted?
   
   <img width="1078" height="452" alt="Image" 
src="https://github.com/user-attachments/assets/970cf45f-5609-4a38-b1bf-2da2ffdad8ac";
 />
   
   ### What you think should happen instead?
   
   I would expect the backfill runs to execute and then either complete or fail.
   
   ### How to reproduce
   
   1. Click on the Backfill button in a DAG
   2. Pick any date range that fits and click Run Backfill (a programmatic alternative is sketched below the screenshot)
   
   <img width="1201" height="839" alt="Image" 
src="https://github.com/user-attachments/assets/a79a86b8-a4ba-4484-9ecd-fdedf4934e31";
 />
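
   The backfill can also be created without the UI; below is a hedged sketch against what I believe is the v2 REST API backfill endpoint (the endpoint path, payload field names, and the auth token are my assumptions, mirroring the columns of the `backfill` table above):
   
   ```
   # Sketch: create the same backfill through the REST API instead of the UI.
   # The /api/v2/backfills path, payload field names, and the bearer token are
   # assumptions on my side, mirroring the `backfill` table columns above.
   import requests
   
   resp = requests.post(
       "http://airflow-api-server:8080/api/v2/backfills",
       headers={"Authorization": "Bearer <token>"},  # placeholder token
       json={
           "dag_id": "ble-monitoring",
           "from_date": "2025-11-11T00:30:00+00:00",
           "to_date": "2025-11-12T00:30:00+00:00",
           "reprocess_behavior": "completed",
           "max_active_runs": 1,
       },
       timeout=30,
   )
   print(resp.status_code, resp.json())
   ```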
   
   ### Operating System
   
   Debian GNU/Linux 12 (bookworm)
   
   ### Versions of Apache Airflow Providers
   
   ```
   apache-airflow-providers-amazon          9.17.0
   apache-airflow-providers-cncf-kubernetes 10.10.0
   apache-airflow-providers-common-compat   1.8.0
   apache-airflow-providers-common-io       1.6.4
   apache-airflow-providers-common-sql      1.29.0
   apache-airflow-providers-databricks      7.7.5
   apache-airflow-providers-fab             3.0.2
   apache-airflow-providers-github          2.9.4
   apache-airflow-providers-hashicorp       4.3.4
   apache-airflow-providers-http            5.5.0
   apache-airflow-providers-microsoft-mssql 4.3.3
   apache-airflow-providers-mysql           6.3.5
   apache-airflow-providers-postgres        6.4.1
   apache-airflow-providers-sftp            5.4.2
   apache-airflow-providers-slack           9.4.0
   apache-airflow-providers-smtp            2.3.1
   apache-airflow-providers-ssh             4.1.6
   apache-airflow-providers-standard        1.9.1
   ```
   
   ### Deployment
   
   Official Apache Airflow Helm Chart
   
   ### Deployment details
   
   Example dag:
   ```
   import datetime as dt
   
   from airflow_addons.operators.databricks import DatabricksSubmitRunOperator
   
   from airflow.sdk import DAG, Variable
   
   DAG_ID = "ble-monitoring"
   
   default_args = {
       "owner": "team_iot_data",
       "depends_on_past": False,
       "start_date": dt.datetime(2025, 11, 1),
   }
   
   ble_monitoring_task = {
       "notebook_path": "ble_monitoring/ble_monitoring_data_refresh",
       "base_parameters": {
           "COMPUTE_FOR_DATE": "{{ data_interval_start.strftime('%Y-%m-%d') }}"
       }
   }
   
   with DAG(
           dag_id=DAG_ID,
           description="BLE Monitoring Dashboard on PowerBi",
           default_args=default_args,
           schedule="30 0 * * *",
           catchup=False,
           max_active_runs=1,
           tags={"dashboards"},
           doc_md=__doc__,
   ) as dag:
   
       DatabricksSubmitRunOperator(
           task_id="ble_monitoring",
           new_cluster={},
           notebook_task=ble_monitoring_task,
           timeout_seconds=3600,  # 1 hour
           polling_period_seconds=30,
           retries=0,
           databricks_retry_limit=10,
           databricks_retry_delay=10,
           do_xcom_push=True,
           git_url="https://github.com/repo/",
       )
   
   ```
   
   env vars:
   ```
   AIRFLOW_API_SERVER_PORT_8080_TCP_PROTO=tcp
   AIRFLOW_STATSD_PORT_9102_TCP_PROTO=tcp
   AIRFLOW__KUBERNETES_ENVIRONMENT_VARIABLES__AIRFLOW__SCHEDULER__CREATE_CRON_DATA_INTERVALS=True
   AIRFLOW__WEBSERVER__SHOW_TRIGGER_FORM_IF_NO_PARAMS=True
   AIRFLOW_API_SERVER_PORT_8080_TCP=tcp://172.20.254.149:8080
   AIRFLOW_API_SERVER_PORT=tcp://172.20.254.149:8080
   AIRFLOW_STATSD_SERVICE_PORT_STATSD_INGEST=9125
   AIRFLOW_USER_HOME_DIR=/home/airflow
   AIRFLOW__CORE__DAGS_FOLDER=/opt/airflow/dags
   AIRFLOW_PGBOUNCER_PORT_6543_TCP_PORT=6543
   AIRFLOW__CORE__PARALLELISM=40
   AIRFLOW_PGBOUNCER_PORT_9127_TCP_PORT=9127
   AIRFLOW__KUBERNETES_ENVIRONMENT_VARIABLES__AIRFLOW_AUTH_ROLES_MAPPING={"app_airflow_prod_admin": [ "Admin" ], "app_airflow_prod_user": [ "User" ], "app_airflow_prod_viewer": [ "Viewer" ]}
   AIRFLOW_API_SERVER_PORT_8080_TCP_ADDR=172.20.254.149
   AIRFLOW_VERSION=3.1.3
   AIRFLOW__CORE__LOAD_EXAMPLES=false
   AIRFLOW_STATSD_PORT_9125_UDP_ADDR=172.20.167.106
   AIRFLOW_URL_REDIRECT_URI=https://redirect-url/auth/oauth-authorized/okta
   AIRFLOW_API_SERVER_SERVICE_PORT=8080
   AIRFLOW_STATSD_PORT_9102_TCP=tcp://172.20.167.106:9102
   AIRFLOW_PGBOUNCER_SERVICE_HOST=172.20.108.237
   AIRFLOW_HOME=/opt/airflow
   AIRFLOW_STATSD_SERVICE_PORT=9125
   AIRFLOW_USE_UV=false
   AIRFLOW_PGBOUNCER_PORT_6543_TCP_ADDR=172.20.108.237
   AIRFLOW_PGBOUNCER_PORT_9127_TCP_ADDR=172.20.108.237
   AIRFLOW_PIP_VERSION=25.3
   AIRFLOW_STATSD_PORT_9102_TCP_ADDR=172.20.167.106
   AIRFLOW_STATSD_PORT_9125_UDP_PROTO=udp
   AIRFLOW__KUBERNETES_ENVIRONMENT_VARIABLES__AIRFLOW__CORE__PARALLELISM=40
   AIRFLOW_PGBOUNCER_PORT=tcp://172.20.108.237:6543
   AIRFLOW_PGBOUNCER_SERVICE_PORT_PGB_METRICS=9127
   AIRFLOW__KUBERNETES_ENVIRONMENT_VARIABLES__AIRFLOW_VAR_ENV_NAME=prod
   AIRFLOW_UV_VERSION=0.9.9
   AIRFLOW__KUBERNETES_ENVIRONMENT_VARIABLES__PYTHONASYNCIODEBUG=1
   AIRFLOW_PGBOUNCER_SERVICE_PORT_PGBOUNCER=6543
   AIRFLOW_STATSD_PORT_9125_UDP_PORT=9125
   AIRFLOW_PGBOUNCER_PORT_6543_TCP_PROTO=tcp
   AIRFLOW_PGBOUNCER_PORT_9127_TCP=tcp://172.20.108.237:9127
   AIRFLOW_STATSD_PORT_9102_TCP_PORT=9102
   AIRFLOW_STATSD_SERVICE_PORT_STATSD_SCRAPE=9102
   AIRFLOW_VAR_ENV_NAME=prod
   AIRFLOW_STATSD_SERVICE_HOST=172.20.167.106
   AIRFLOW__KUBERNETES_ENVIRONMENT_VARIABLES__AIRFLOW_URL_REDIRECT_URI=https://redirect-url/auth/oauth-authorized/okta
   AIRFLOW_API_SERVER_SERVICE_HOST=172.20.254.149
   AIRFLOW_IMAGE_TYPE=prod
   AIRFLOW_PYTHON_VERSION=3.12.12
   AIRFLOW_INSTALLATION_METHOD=apache-airflow
   AIRFLOW_API_SERVER_SERVICE_PORT_AIRFLOW_UI=8080
   AIRFLOW__SCHEDULER__CREATE_CRON_DATA_INTERVALS=True
   AIRFLOW_API_SERVER_PORT_8080_TCP_PORT=8080
   AIRFLOW_PGBOUNCER_PORT_6543_TCP=tcp://172.20.108.237:6543
   AIRFLOW_PGBOUNCER_PORT_9127_TCP_PROTO=tcp
   AIRFLOW__CORE__TEST_CONNECTION=Enabled
   AIRFLOW_UID=50000
   AIRFLOW_STATSD_PORT=udp://172.20.167.106:9125
   AIRFLOW_PGBOUNCER_SERVICE_PORT=6543
   AIRFLOW_STATSD_PORT_9125_UDP=udp://172.20.167.106:9125
   AIRFLOW_AUTH_ROLES_MAPPING={"app_airflow_prod_admin": [ "Admin" ], "app_airflow_prod_user": [ "User" ], "app_airflow_prod_viewer": [ "Viewer" ]}
   ```
   
   `airflow.cfg`
   
   ```
   [api]
   enable_proxy_fix = True
   log_config = 
   
   [celery]
   flower_url_prefix = 
   worker_concurrency = 16
   
   [celery_kubernetes_executor]
   kubernetes_queue = kubernetes
   
   [core]
   auth_manager = airflow.providers.fab.auth_manager.fab_auth_manager.FabAuthManager
   colored_console_log = False
   dags_folder = /opt/airflow/dags
   default_task_retries = 2
   execution_api_server_url = http://airflow-api-server:8080/execution/
   executor = KubernetesExecutor
   load_examples = False
   min_serialized_dag_fetch_interval = 300
   min_serialized_dag_update_interval = 300
   remote_logging = True
   
   [dag_processor]
   min_file_process_interval = 300
   parsing_processes = 1
   print_stats_interval = 300
   refresh_interval = 300
   stale_bundle_cleanup_min_versions = 2
   
   [elasticsearch]
   json_format = True
   log_id_template = {dag_id}_{task_id}_{execution_date}_{try_number}
   
   [elasticsearch_configs]
   max_retries = 3
   retry_timeout = True
   timeout = 30
   
   [email]
   default_email_on_failure = False
   default_email_on_retry = False
   
   [fab]
   enable_proxy_fix = True
   
   [kerberos]
   ccache = /var/kerberos-ccache/cache
   keytab = /etc/airflow.keytab
   principal = [email protected]
   reinit_frequency = 3600
   
   [kubernetes]
   airflow_configmap = airflow-config
   airflow_local_settings_configmap = airflow-config
   multi_namespace_mode = False
   namespace = airflow
   pod_template_file = /opt/airflow/pod_templates/pod_template_file.yaml
   worker_container_repository = docker-hub/analytics-airflow
   worker_container_tag = 0.0.342
   
   [kubernetes_executor]
   airflow_configmap = airflow-config
   airflow_local_settings_configmap = airflow-config
   delete_worker_pods_on_failure = True
   multi_namespace_mode = False
   namespace = airflow
   pod_template_file = /opt/airflow/pod_templates/pod_template_file.yaml
   worker_container_repository = docker-hub/analytics-airflow
   worker_container_tag = 0.0.342
   
   [logging]
   colored_console_log = False
   logging_level = INFO
   remote_log_conn_id = tu-s3-logs
   remote_logging = True
   
   [metrics]
   statsd_host = airflow-statsd
   statsd_on = True
   statsd_port = 9125
   statsd_prefix = airflow
   
   [scheduler]
   run_duration = 41460
   standalone_dag_processor = True
   statsd_host = airflow-statsd
   statsd_on = True
   statsd_port = 9125
   statsd_prefix = airflow
   ```
   
   
   ### Anything else?
   
   If I missed something, please point it out and I will update the docs.
   Could you please highlight areas where I could start investigating?
   
   Thank you for your contribution; I can see that a lot of good work has been put into Airflow 3!
   
   ### Are you willing to submit PR?
   
   - [x] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

