carlos54 opened a new issue, #63527:
URL: https://github.com/apache/airflow/issues/63527

   ### Apache Airflow version
   
   3.1.8
   
   ### If "Other Airflow 3 version" selected, which one?
   
   _No response_
   
   ### What happened?
   
   The DagBundleProcessor fails and terminates with an AttributeError: 
'NoneType' object has no attribute 'last_refreshed'.
   
   This occurs in airflow/dag_processing/manager.py during the 
_refresh_dag_bundles loop. When a DAG bundle is newly configured, the call to 
session.get(DagBundleModel, bundle.name) returns None because the bundle record 
hasn't been persisted in the database yet (or hasn't been synchronized). The 
code immediately attempts to access .last_refreshed on this None object, 
causing the process to crash.
   
   
   ```
   File 
"/usr/local/lib/python3.12/site-packages/airflow/dag_processing/manager.py", 
line 565, in _refresh_dag_bundles
       now - (bundle_model.last_refreshed or utc_epoch())
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^
   AttributeError: 'NoneType' object has no attribute 'last_refreshed'
   2026-03-13T13:10:11.638968Z [debug    ] Disposing DB connection pool (PID 
64) [airflow.settings] loc=settings.py:539
   command terminated with non-zero exit code: exit status 1
   ```
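   The `or utc_epoch()` fallback only covers `last_refreshed` being `None`; it does nothing when `bundle_model` itself is `None`. A minimal standalone sketch of the failure mode (the `DagBundleModel` class and `utc_epoch` helper here are simplified stand-ins for the Airflow internals):

   ```python
   from datetime import datetime, timezone

   def utc_epoch() -> datetime:
       # Stand-in for Airflow's epoch helper
       return datetime(1970, 1, 1, tzinfo=timezone.utc)

   class DagBundleModel:
       def __init__(self, last_refreshed=None):
           self.last_refreshed = last_refreshed

   now = datetime.now(timezone.utc)

   # Persisted bundle that was never refreshed: the fallback works.
   model = DagBundleModel()
   elapsed = (now - (model.last_refreshed or utc_epoch())).total_seconds()

   # Newly configured bundle: session.get() returned None entirely.
   bundle_model = None
   try:
       (now - (bundle_model.last_refreshed or utc_epoch())).total_seconds()
   except AttributeError as exc:
       print(exc)  # 'NoneType' object has no attribute 'last_refreshed'
   ```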
   
   ### What you think should happen instead?
   
   The DagBundlesManager should be resilient to missing database records. If `bundle_model` is `None`, the processor should either:
   
   - skip the refresh for that specific bundle in the current iteration and log a warning, or
   - more robustly, initialize a new `DagBundleModel` instance for that bundle so the loop can continue and the record is persisted when the session commits.
   
   In general, attribute access on the result of a database fetch should be guarded by a `None` check to keep the processor stable and idempotent.
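   The first option could be sketched as a small guarded helper (this is an illustration, not the actual manager.py code; the function name and logger setup are hypothetical):

   ```python
   import logging
   from datetime import datetime, timezone

   log = logging.getLogger("dag_processor")

   def elapsed_since_refresh(bundle_model, now: datetime) -> float | None:
       """Seconds since the bundle was last refreshed, or None when the
       bundle has no database record yet (caller should skip it)."""
       if bundle_model is None:
           # Row not persisted yet (e.g. a freshly configured bundle):
           # warn and skip this iteration instead of crashing.
           log.warning("No DagBundleModel row for bundle; skipping this cycle")
           return None
       epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)  # stand-in for utc_epoch()
       return (now - (bundle_model.last_refreshed or epoch)).total_seconds()
   ```

   The caller would then `continue` to the next bundle whenever the helper returns `None`, and the bundle is picked up on a later cycle once the row exists.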
   
   
   
   ### How to reproduce
   
   Use Apache Airflow 3.1.8 (or any version implementing AIP-66 DAG Bundles).
   
   Add a new DAG Bundle configuration in airflow.cfg or via environment 
variables (e.g., AIRFLOW__DAG_BUNDLES__MY_NEW_BUNDLE__TYPE=git).
   
   Start the Airflow dag-processor.
   
   The processor will crash immediately during its first scan cycle when it tries to calculate `elapsed_time_since_refresh` for the newly detected bundle before `sync_bundles_to_db` has persisted the entry.
   
   
   ```
   # manager.py L563 (approx)
   with create_session() as session:
       bundle_model: DagBundleModel | None = session.get(DagBundleModel, bundle.name)
       # CRASH HERE: bundle_model is None
       elapsed_time_since_refresh = (
           now - (bundle_model.last_refreshed or utc_epoch())
       ).total_seconds()
   ```
   
   Suggested fix (around L563):
   ```
   bundle_model: DagBundleModel | None = session.get(DagBundleModel, bundle.name)
   # --- PATCH BEGIN ---
   if bundle_model is None:
       self.log.warning(
           "DagBundleModel not found for bundle %s; creating it", bundle.name
       )
       bundle_model = DagBundleModel(name=bundle.name)
       session.add(bundle_model)
   # --- PATCH END ---
   ```
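   The patch above amounts to a standard get-or-create. A self-contained sketch of the same pattern using an in-memory SQLite database and a simplified model (the real Airflow model has more columns; the bundle name here is made up):

   ```python
   from sqlalchemy import String, create_engine
   from sqlalchemy.orm import DeclarativeBase, Mapped, Session, mapped_column

   class Base(DeclarativeBase):
       pass

   class DagBundleModel(Base):  # simplified stand-in for Airflow's model
       __tablename__ = "dag_bundle"
       name: Mapped[str] = mapped_column(String, primary_key=True)

   engine = create_engine("sqlite://")
   Base.metadata.create_all(engine)

   with Session(engine) as session:
       bundle_model = session.get(DagBundleModel, "my_new_bundle")
       if bundle_model is None:
           # get-or-create: persist the missing row so the loop can proceed
           bundle_model = DagBundleModel(name="my_new_bundle")
           session.add(bundle_model)
       session.commit()
   ```

   After the commit, a later `session.get(DagBundleModel, "my_new_bundle")` returns the row, so subsequent refresh cycles see a valid `last_refreshed` timestamp instead of `None`.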
   
   ### Operating System
   
   RHEL 9.6 (Plow)
   
   ### Versions of Apache Airflow Providers
   
         apache-airflow==3.1.8 \
         structlog==25.5.0 \
         psycopg2-binary==2.9.11 \
         asyncpg==0.31.0 \
         apache-airflow-providers-fab==3.4.0 \
         apache-airflow-providers-redis==4.4.2 \
         apache-airflow-providers-git==0.2.4 \
         apache-airflow-providers-cncf-kubernetes==10.13.0 \
         apache-airflow-providers-smtp==2.4.2 \
         flask-limiter==3.12 \
         redis==6.4.0 \
         authlib==1.6.9 \
         PyJWT==2.11.0 \
         cryptography==42.0.8 \
         requests==2.32.5
   
   ### Deployment
   
   Official Apache Airflow Helm Chart
   
   ### Deployment details
   
   Helm chart 1.18
   
   ### Anything else?
   
   [core]
   dags_folder = /opt/airflow/dags
   hostname_callable = airflow.utils.net.getfqdn
   might_contain_dag_callable = 
airflow.utils.file.might_contain_dag_via_default_heuristic
   default_timezone = Europe/Paris
   executor = KubernetesExecutor
   auth_manager = [CUSTOM_AUTH_MANAGER]
   simple_auth_manager_users = [REDACTED]
   simple_auth_manager_all_admins = False
   parallelism = 5
   max_active_tasks_per_dag = 2
   dags_are_paused_at_creation = True
   max_active_runs_per_dag = 1
   max_consecutive_failed_dag_runs_per_dag = 0
   load_examples = False
   plugins_folder = /opt/airflow/plugins
   fernet_key = [REDACTED]
   dagbag_import_timeout = 30.0
   default_impersonation =
   security =
   unit_test_mode = False
   killed_task_cleanup_time = 60
   dag_run_conf_overrides_params = True
   default_task_retries = 0
   default_task_retry_delay = 300
   max_task_retry_delay = 86400
   default_task_weight_rule = downstream
   task_success_overtime = 20
   min_serialized_dag_update_interval = 30
   compress_serialized_dags = False
   min_serialized_dag_fetch_interval = 10
   max_num_rendered_ti_fields_per_task = 30
   xcom_backend = airflow.sdk.execution_time.xcom.BaseXCom
   hide_sensitive_var_conn_fields = True
   default_pool_task_slot_count = 128
   max_map_length = 1024
   daemon_umask = 0o077
   test_connection = Disabled
   max_templated_field_length = 4096
   execution_api_server_url = [REDACTED]
   colored_console_log = False
   
   [database]
   alembic_ini_file_path = alembic.ini
   sql_alchemy_conn = [REDACTED_DB_CONNECTION_STRING]
   sql_engine_encoding = utf-8
   sql_alchemy_pool_enabled = true
   sql_alchemy_pool_size = 5
   sql_alchemy_max_overflow = 10
   sql_alchemy_pool_recycle = 1800
   sql_alchemy_pool_pre_ping = true
   max_db_retries = 3
   check_migrations = True
   migration_batch_size = 10000
   
   [logging]
   base_log_folder = /opt/airflow/logs
   remote_logging = false
   logging_level = ERROR
   fab_logging_level = ERROR
   colored_console_log = False
   log_format =
   simple_log_format = %(asctime)s %(levelname)s - %(message)s
   dag_processor_log_target = file
   log_filename_template = [REDACTED]
   
   [metrics]
   statsd_on = False
   statsd_host = [REDACTED]
   statsd_port = 9125
   statsd_prefix = airflow
   
   [api]
   enable_swagger_ui = True
   secret_key = [REDACTED]
   expose_config = False
   expose_stacktrace = true
   base_url = [REDACTED]
   host = 0.0.0.0
   port = 8080
   workers = 1
   worker_timeout = 120
   ssl_cert =
   ssl_key =
   maximum_page_limit = 100
   fallback_page_limit = 50
   hide_paused_dags_by_default = False
   page_size = 50
   auto_refresh_interval = 3
   secret_key_cmd = [REDACTED]
   
   [workers]
   min_heartbeat_interval = 5
   max_failed_heartbeats = 3
   
   [scheduler]
   job_heartbeat_sec = 5
   scheduler_heartbeat_sec = 5
   task_instance_heartbeat_sec = 0
   num_runs = -1
   scheduler_idle_sleep_time = 1
   catchup_by_default = False
   max_tis_per_query = 16
   use_row_level_locking = True
   max_dagruns_to_create_per_loop = 10
   max_dagruns_per_loop_to_schedule = 20
   use_job_schedule = True
   run_duration = 41460
   
   [kubernetes_executor]
   pod_template_file = [REDACTED]
   worker_container_repository = [REDACTED]
   worker_container_tag = [REDACTED]
   namespace = [REDACTED]
   delete_worker_pods = True
   in_cluster = True
   kube_client_request_args =
   enable_tcp_keepalive = True
   
   [webserver]
   enable_proxy_fix = True
   rbac = True
   
   [email]
   email_backend = airflow.utils.email.send_email_smtp
   email_conn_id = smtp_default
   default_email_on_retry = True
   default_email_on_failure = True
   smtp_host = [REDACTED]
   smtp_mail_from = [REDACTED]
   
   [ctie_auth]
   oidc_access_token_url = [REDACTED]
   oidc_api_base = [REDACTED]
   oidc_authorize_url = [REDACTED]
   oidc_metadata_url = [REDACTED]
   oidc_userinfo_endpoint = [REDACTED]
   redirect_url = [REDACTED]
   
   
[3.1.8_manager.py](https://github.com/user-attachments/files/25973421/3.1.8_manager.py)
   
   ### Are you willing to submit PR?
   
   - [x] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
