TejasMorbagal opened a new issue, #54550:
URL: https://github.com/apache/airflow/issues/54550

   ### Official Helm Chart version
   
   1.18.0 (latest released)
   
   ### Apache Airflow version
   
   3.0.4, 3.0.2
   
   ### Kubernetes Version
   
   1.31
   
   ### Helm Chart configuration
   
   airflow:
     defaultAirflowTag: "3.0.4-python3.12"
     enabled: true
     airflowVersion: "3.0.4"
     fullnameOverride: "airflow-production"
   
     executor: KubernetesExecutor
     allowPodLaunching: true
   
     apiServer:
       defaultUser:
         enabled: false
       replicas: 1
       serviceAccount:
         create: false
         name: airflow-sa
   #      annotations:
   #        eks.amazonaws.com/role-arn: arn:aws:iam::####:policy/s3_role
   
     redis:
       enabled: false
   
     postgresql:
       enabled: false
   
     env:
       - name: PYTHONPATH
         value: "/opt/airflow/dags:$PYTHONPATH"
   
     data:
       metadataSecretName: db-connection
   
     connectionsTemplates:
       ACCESS_KEY_ID:
         kind: secret
         name: aws-token
         key: AWS_ACCESS_KEY_ID
       SECRET_ACCESS_KEY:
         kind: secret
         name: aws-token
         key: AWS_SECRET_ACCESS_KEY
   
     config:
       scheduler:
         # Let heartbeats be up to 10 minutes old without declaring "dead"
         scheduler_health_check_threshold: 600
         parsing_processes: 2
       core:
         load_examples: "False"
   
       logging:
         remote_logging: "True"
         logging_level: "INFO"
         remote_log_conn_id: "s3_default"
         remote_base_log_folder: "s3://airflow/logs"
         encrypt_s3_logs: "False"
   
     ingress:
       apiServer:
         enabled: true
         ingressClassName: nginx
         annotations:
           cert-manager.io/cluster-issuer: letsencrypt
           nginx.ingress.kubernetes.io/enable-cors: "true"
         host: hostname
         path: /
         pathType: Prefix
         tls:
           enabled: true
           secretName: airflow-tls-secret
   
     workers:
       serviceAccount:
         create: false
         name: airflow-sa
          #      annotations:
          #        eks.amazonaws.com/role-arn: <ENTER_IAM_ROLE_ARN_CREATED_BY_EKSCTL_COMMAND>
       resources:
         requests:
           cpu: 200m
           memory: 2Gi
         limits:
           cpu: 500m
           memory: 5Gi
   
     scheduler:
       # Remote logging to S3 is enabled, so local logs only need short retention
   #    env:
   #      - name: AIRFLOW__CORE__HOSTNAME_CALLABLE
   #        value: socket.gethostname
       logGroomerSidecar:
         enabled: true
         # Keep retention short because S3 has the long-term copy
         env:
           - name: RETENTION_DAYS
             value: "3"
         command:
           - bash
           - -ec
           - |
             echo "Cleaning logs every 900 seconds"
             while true; do
               echo "Trimming airflow logs to ${RETENTION_DAYS:-3} days."
               find /opt/airflow/logs -mindepth 1 -type f -mtime +${RETENTION_DAYS:-3} -print -delete || true
               find /opt/airflow/logs -mindepth 1 -type d -empty -print -delete || true
               sleep 900
             done
   
       # ✅ Tweak startupProbe: give the scheduler time to emit its first heartbeat
       startupProbe:
   #      command: ["bash","-ec","airflow jobs check --job-type SchedulerJob --local"]
         command:
           - /bin/bash
           - -c
           - airflow jobs check --job-type SchedulerJob --local
         initialDelaySeconds: 30
         periodSeconds: 10
         timeoutSeconds: 30
         failureThreshold: 10   # loop handles retries
   
       # ✅ Liveness after startup is stable
       livenessProbe:
   #      command: ["bash","-ec","airflow jobs check --job-type SchedulerJob --local"]
         command:
           - /bin/bash
           - -c
           - airflow jobs check --job-type SchedulerJob --local
         failureThreshold: 3
         periodSeconds: 30
         timeoutSeconds: 30
       resources:
         requests:
           cpu: 2
           memory: 5Gi
         limits:
           cpu: 4
           memory: 10Gi
   
     dagProcessor:
       enabled: true
       resources:
         requests:
           cpu: 1
           memory: 10Gi
         limits:
           cpu: 3
           memory: 12Gi
   
     dags:
       gitSync:
         enabled: true
         repo: https://github.com/my-org/dags.git
         branch: main
         rev: HEAD
         depth: 1
         maxFailures: 0
         subPath: "dags"
         credentialsSecret: git-credentials
   
     triggerer:
       enabled: true
   
     migrateDatabaseJob:
       enabled: true
       applyCustomEnv: false
       useHelmHooks: false  # using Argo CD hooks instead of Helm hooks
       jobAnnotations:
         argocd.argoproj.io/hook: PreSync
         # Keep the last hook object around until the next sync so you can inspect logs
         argocd.argoproj.io/hook-delete-policy: BeforeHookCreation
       ttlSecondsAfterFinished: null
   
     createUserJob:
       useHelmHooks: false
       applyCustomEnv: false
   
   
   ### Docker Image customizations
   
   No docker image customizations
   
   ### What happened
   
   1. The scheduler doesn't start and is stuck indefinitely; the liveness probe eventually fails.
   2. All other components run fine.
   3. Interestingly, the liveness probe command succeeds when executed manually in the pod's container:

   ```
   airflow@airflow-aws-production-scheduler-85bf4f4747-9ckpz:/opt/airflow$ airflow jobs check --job-type SchedulerJob --local
   Found one alive job.
   ```
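For comparison, the probe can also be replayed the way the kubelet runs it (non-interactive, no TTY, no login shell), which rules out shell-environment differences between manual and probe execution; the pod name below is the one from this deployment:

```shell
# Run the exact probe command non-interactively, as the kubelet does,
# and print the exit code the probe would observe.
kubectl exec airflow-aws-production-scheduler-85bf4f4747-9ckpz -c scheduler -- \
  /bin/bash -c 'airflow jobs check --job-type SchedulerJob --local'
echo "probe exit code: $?"
```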
   Below is the scheduler pod log:
   ```
   k exec -it airflow-aws-production-scheduler-85bf4f4747-9ckpz /bin/bash
   kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
   Defaulted container "scheduler" out of: scheduler, scheduler-log-groomer, wait-for-airflow-migrations (init)
   airflow@airflow-aws-production-scheduler-85bf4f4747-9ckpz:/opt/airflow$ airflow scheduler -v --stderr err.txt --stdout out.txt
   [2025-08-15T14:17:59.463+0000] {providers_manager.py:356} DEBUG - Initializing Providers Manager[config]
   [2025-08-15T14:17:59.465+0000] {providers_manager.py:356} DEBUG - Initializing Providers Manager[list]
   [2025-08-15T14:17:59.662+0000] {providers_manager.py:598} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.amazon.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-amazon
   [2025-08-15T14:17:59.675+0000] {providers_manager.py:598} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.celery.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-celery
   [2025-08-15T14:17:59.679+0000] {providers_manager.py:598} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.cncf.kubernetes.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-cncf-kubernetes
   [2025-08-15T14:17:59.683+0000] {providers_manager.py:598} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.common.compat.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-common-compat
   [2025-08-15T14:17:59.685+0000] {providers_manager.py:598} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.common.io.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-common-io
   [2025-08-15T14:17:59.687+0000] {providers_manager.py:598} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.common.messaging.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-common-messaging
   [2025-08-15T14:17:59.688+0000] {providers_manager.py:598} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.common.sql.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-common-sql
   [2025-08-15T14:17:59.690+0000] {providers_manager.py:598} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.docker.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-docker
   [2025-08-15T14:17:59.692+0000] {providers_manager.py:598} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.elasticsearch.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-elasticsearch
   [2025-08-15T14:17:59.694+0000] {providers_manager.py:598} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.fab.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-fab
   [2025-08-15T14:17:59.697+0000] {providers_manager.py:598} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.ftp.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-ftp
   [2025-08-15T14:17:59.699+0000] {providers_manager.py:598} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.git.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-git
   [2025-08-15T14:17:59.701+0000] {providers_manager.py:598} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.google.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-google
   [2025-08-15T14:17:59.713+0000] {providers_manager.py:598} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.grpc.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-grpc
   [2025-08-15T14:17:59.715+0000] {providers_manager.py:598} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.hashicorp.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-hashicorp
   [2025-08-15T14:17:59.717+0000] {providers_manager.py:598} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.http.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-http
   [2025-08-15T14:17:59.719+0000] {providers_manager.py:598} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.microsoft.azure.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-microsoft-azure
   [2025-08-15T14:17:59.724+0000] {providers_manager.py:598} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.mysql.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-mysql
   [2025-08-15T14:17:59.726+0000] {providers_manager.py:598} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.odbc.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-odbc
   [2025-08-15T14:17:59.727+0000] {providers_manager.py:598} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.openlineage.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-openlineage
   [2025-08-15T14:17:59.730+0000] {providers_manager.py:598} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.postgres.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-postgres
   [2025-08-15T14:17:59.732+0000] {providers_manager.py:598} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.redis.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-redis
   [2025-08-15T14:17:59.734+0000] {providers_manager.py:598} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.sendgrid.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-sendgrid
   [2025-08-15T14:17:59.736+0000] {providers_manager.py:598} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.sftp.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-sftp
   [2025-08-15T14:17:59.737+0000] {providers_manager.py:598} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.slack.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-slack
   [2025-08-15T14:17:59.739+0000] {providers_manager.py:598} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.smtp.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-smtp
   [2025-08-15T14:17:59.741+0000] {providers_manager.py:598} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.snowflake.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-snowflake
   [2025-08-15T14:17:59.743+0000] {providers_manager.py:598} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.ssh.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-ssh
   [2025-08-15T14:17:59.744+0000] {providers_manager.py:598} DEBUG - Loading EntryPoint(name='provider_info', value='airflow.providers.standard.get_provider_info:get_provider_info', group='apache_airflow_provider') from package apache-airflow-providers-standard
   [2025-08-15T14:17:59.746+0000] {providers_manager.py:359} DEBUG - Initialization of Providers Manager[list] took 0.28 seconds
   [2025-08-15T14:17:59.746+0000] {configuration.py:1871} DEBUG - Loading providers configuration
   [2025-08-15T14:17:59.768+0000] {providers_manager.py:359} DEBUG - Initialization of Providers Manager[config] took 0.30 seconds
     ____________       _____________
    ____    |__( )_________  __/__  /________      __
   ____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /
   ___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /
    _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/
   [2025-08-15T14:17:59.889+0000] {plugins_manager.py:353} DEBUG - Loading plugins
   [2025-08-15T14:17:59.889+0000] {plugins_manager.py:269} DEBUG - Loading plugins from directory: /opt/airflow/plugins
   [2025-08-15T14:17:59.889+0000] {plugins_manager.py:249} DEBUG - Loading plugins from entrypoints
   [2025-08-15T14:17:59.890+0000] {plugins_manager.py:252} DEBUG - Importing entry_point plugin openlineage
   [2025-08-15T14:18:00.165+0000] {plugins_manager.py:365} DEBUG - Loading 1 plugin(s) took 275.59 seconds
   [2025-08-15T14:18:00.165+0000] {listener.py:37} DEBUG - Calling 'on_starting' with {'component': <airflow.jobs.job.Job object at 0x7fd2d27c2d50>}
   [2025-08-15T14:18:00.165+0000] {listener.py:38} DEBUG - Hook impls: []
   [2025-08-15T14:18:00.166+0000] {listener.py:42} DEBUG - Result from 'on_starting': []
   [2025-08-15T14:18:00.184+0000] {scheduler_job_runner.py:996} INFO - Starting the scheduler
   [2025-08-15T14:18:00.185+0000] {scheduler_job_runner.py:1006} DEBUG - Using DatabaseCallbackSink as callback sink.
   [2025-08-15T14:18:00.185+0000] {executor_loader.py:257} DEBUG - Loading executor :KubernetesExecutor: from core
   ```
   
   ### What you think should happen instead
   
   _No response_
   
   ### How to reproduce
   
   Install Helm chart 1.18.0 with the KubernetesExecutor.
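For instance, with the official chart repo (the release and namespace names here are placeholders, and `values.yaml` holds the configuration shown above):

```shell
# Install chart version 1.18.0 from the official Apache Airflow repo.
# "airflow-production" and "airflow" are hypothetical release/namespace names.
helm repo add apache-airflow https://airflow.apache.org
helm repo update
helm upgrade --install airflow-production apache-airflow/airflow \
  --version 1.18.0 \
  --namespace airflow --create-namespace \
  -f values.yaml
```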
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.