4406arthur opened a new issue #11744:
URL: https://github.com/apache/airflow/issues/11744


   
   
   **Apache Airflow version**: v1.10.12
   
   
   **Kubernetes version** : v1.18.2
   
   **Environment**:
   
   - **Cloud provider or hardware configuration**:  on-premise
   - **OS** (e.g. from /etc/os-release): Red Hat 7.8 
   - **Kernel** (e.g. `uname -a`): 3.10
   - **Install tools**: helm stable/airflow chart  version 7.13.0
   
   **What happened**:
   
   The failing DAG pod's `kubectl describe` output is below. The `git-sync-clone` init container exited immediately without cloning the repo, and its logs contain nothing but the `git` usage text.
    
   ```
   
   Name:         avmpodetlrealnesetl-994b62e3a5c94be8a6c69d466064fa6e
   Namespace:    airflow
   Priority:     0
   Node:         mlaas-k8s-worker-1/10.240.245.75
   Start Time:   Thu, 22 Oct 2020 05:21:08 -0400
   Labels:       airflow-worker=a55d639e-8c38-40ae-9851-b9710c60b2fd
                 airflow_version=1.10.12
                 dag_id=avm_pod_etl
                 execution_date=2020-10-22T09_20_55.288604_plus_00_00
                 kubernetes_executor=True
                 task_id=real_nes_etl
                 try_number=1
   Annotations:  <none>
   Status:       Failed
   IP:           10.233.103.87
   IPs:
     IP:  10.233.103.87
   Init Containers:
     git-sync-clone:
       Container ID:   docker://5fa243b6c27f6734835e78612026d17d9a62454e35c51cf1fb9db4ff664994b7
       Image:          private-harbor:8080/library/git:latest
       Image ID:       docker-pullable://private-harbor:8080/library/git@sha256:18d268a6d938f513040674b38d6ea2484d2384aa6904cb8d9a96f7a5e8304ca7
       Port:           <none>
       Host Port:      <none>
       State:          Terminated
         Reason:       Completed
         Exit Code:    0
         Started:      Thu, 22 Oct 2020 05:21:12 -0400
         Finished:     Thu, 22 Oct 2020 05:21:12 -0400
       Ready:          True
       Restart Count:  0
       Environment:
         GIT_SYNC_REPO:      ssh://xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
         GIT_SYNC_BRANCH:    master
         GIT_SYNC_ROOT:      /git
         GIT_SYNC_DEST:      repo
         GIT_SYNC_DEPTH:     1
         GIT_SYNC_ONE_TIME:  true
         GIT_SYNC_REV:       
         GIT_SSH_KEY_FILE:   /etc/git-secret/ssh
         GIT_SYNC_ADD_USER:  true
         GIT_SYNC_SSH:       true
         GIT_KNOWN_HOSTS:    false
       Mounts:
         /etc/git-secret/ssh from git-sync-ssh-key (rw,path="ssh")
         /git from airflow-dags (rw)
         /var/run/secrets/kubernetes.io/serviceaccount from airflow-token-9qssw (ro)
   Containers:
     base:
       Container ID:  docker://9f23dea94ba7aad0674f54e3b35969d84b084e26942aa14da886a8a43646c039
       Image:         private-harbor:8080/library/airflow:1.10.12-python3.6
       Image ID:      docker-pullable://private-harbor:8080/library/airflow@sha256:9ea9e5ca66bd17632241889ab248fe3852c9f3c830ed299a8ecaa8a13ac2082f
       Port:          <none>
       Host Port:     <none>
       Command:
         airflow
         run
         avm_pod_etl
         real_nes_etl
         2020-10-22T09:20:55.288604+00:00
         --local
         --pool
         default_pool
         -sd
         /opt/airflow/dags/avm_dag.py
       State:          Terminated
         Reason:       Error
         Exit Code:    1
         Started:      Thu, 22 Oct 2020 05:21:15 -0400
         Finished:     Thu, 22 Oct 2020 05:21:27 -0400
       Ready:          False
       Restart Count:  0
       Environment Variables from:
         airflow-env  ConfigMap  Optional: false
       Environment:
         AIRFLOW__CORE__DAGS_FOLDER:       /opt/airflow/dags/repo/
         AIRFLOW__CORE__EXECUTOR:          LocalExecutor
         AIRFLOW__CORE__SQL_ALCHEMY_CONN:  postgresql+psycopg2://airflow:[email protected]:5432/airflow
       Mounts:
         /opt/airflow/dags from airflow-dags (ro)
         /opt/airflow/logs from airflow-logs (rw)
         /var/run/secrets/kubernetes.io/serviceaccount from airflow-token-9qssw (ro)
   Conditions:
     Type              Status
     Initialized       True 
     Ready             False 
     ContainersReady   False 
     PodScheduled      True 
   Volumes:
     airflow-dags:
       Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
       Medium:     
       SizeLimit:  <unset>
     airflow-logs:
       Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
       Medium:     
       SizeLimit:  <unset>
     git-sync-ssh-key:
       Type:        Secret (a volume populated by a Secret)
       SecretName:  airflow-secret
       Optional:    false
     airflow-token-9qssw:
       Type:        Secret (a volume populated by a Secret)
       SecretName:  airflow-token-9qssw
       Optional:    false
   QoS Class:       BestEffort
   
   ```
   
   **What you expected to happen**:
   
   I expected the init container to clone the DAG repo. Instead, the generated init container spec is missing the `command` and `args` sections, so the image's default entrypoint runs, prints the git usage text, and exits.
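
   For anyone triaging: the quickest way I found to confirm the missing command/args is to inspect the pod JSON. A minimal sketch of that check (the pod dict here is a hypothetical stand-in for `kubectl get pod <worker-pod> -o json` output):

```python
import json

def missing_command_and_args(pod: dict) -> list:
    """Return the names of init containers that define neither command nor args."""
    broken = []
    for c in pod.get("spec", {}).get("initContainers", []):
        if not c.get("command") and not c.get("args"):
            broken.append(c["name"])
    return broken

# Minimal stand-in for the pod JSON (illustrative, not real cluster output):
pod_json = json.dumps({
    "spec": {
        "initContainers": [
            {"name": "git-sync-clone",
             "image": "private-harbor:8080/library/git:latest"}
        ]
    }
})

print(missing_command_and_args(json.loads(pod_json)))  # ['git-sync-clone']
```

   An init container with no command/args falls back to the image's default entrypoint, which for a plain git image is just `git` printing its usage and exiting 0 — matching the `Completed` state in the describe output above.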
   
   **How to reproduce it**:
   
   I am sharing my Helm values below; this may simply be a configuration issue:
   
   ```
   ###################################
   # Airflow - Common Configs
   ###################################
   airflow:
     ## configs for the docker image of the web/scheduler/worker
     image:
       repository: private-harbor:8080/library/airflow
       tag: 1.10.12-python3.6
       ## values: Always or IfNotPresent
       pullPolicy: IfNotPresent
       pullSecret: ""
   
     executor: KubernetesExecutor
   
     fernetKey: "7T512UXSSmBOkpWimFHIVb8jK6lfmSAvx4mO6Arehnc="
   
     config:
       AIRFLOW__WEBSERVER__ENABLE_PROXY_FIX: "True"
       AIRFLOW__CORE__LOAD_EXAMPLES: "True"
       AIRFLOW__KUBERNETES__WORKER_CONTAINER_REPOSITORY: private-harbor:8080/library/airflow
       AIRFLOW__KUBERNETES__WORKER_CONTAINER_TAG: 1.10.12-python3.6
       AIRFLOW__KUBERNETES__GIT_SYNC_CONTAINER_REPOSITORY: private-harbor:8080/library/git
       AIRFLOW__KUBERNETES__GIT_SYNC_CONTAINER_TAG: latest
       AIRFLOW__KUBERNETES__GIT_REPO: "ssh://xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
       AIRFLOW__KUBERNETES__GIT_BRANCH: "master"
       AIRFLOW__KUBERNETES__GIT_SSH_KEY_SECRET_NAME: "airflow-secret"
       AIRFLOW__KUBERNETES__GIT_DAGS_FOLDER_MOUNT_POINT: "/opt/airflow/dags"
       AIRFLOW__KUBERNETES__RUN_AS_USER: "50000"
       AIRFLOW__KUBERNETES__DELETE_WORKER_PODS: "False"
       AIRFLOW__KUBERNETES__DAGS_IN_IMAGE: "False"
   
   ###################################
   # Airflow - Scheduler Configs
   ###################################
   scheduler:
   
     resources: {}
   
     ## the nodeSelector configs for the scheduler Pods
     ##
     nodeSelector: {}
   
     ## the affinity configs for the scheduler Pods
     ##
     affinity: {}
   
     ## the toleration configs for the scheduler Pods
     ##
     tolerations: []
   
     ## the security context for the scheduler Pods
     ##
     securityContext: {}
   
     ## labels for the scheduler Deployment
     ##
     labels: {}
   
     ## Pod labels for the scheduler Deployment
     ##
     podLabels: {}
   
     ## annotations for the scheduler Deployment
     ##
     annotations: {}
   
     ## Pod Annotations for the scheduler Deployment
     ##
     podAnnotations: {}
   
     ## if we should tell Kubernetes Autoscaler that it's safe to evict these Pods
     ##
     safeToEvict: true
   
     ## configs for the PodDisruptionBudget of the scheduler
     ##
     podDisruptionBudget:
       ## if a PodDisruptionBudget resource is created for the scheduler
       ##
       enabled: true
   
       ## the maximum unavailable pods/percentage for the scheduler
       ##
       ## NOTE:
       ## - as there is only ever a single scheduler Pod,
       ##   this must be 100% for Kubernetes to be able to migrate it
       ##
       maxUnavailable: "100%"
   
       ## the minimum available pods/percentage for the scheduler
       ##
       minAvailable: ""
   
     connections: []
   
     ## if `scheduler.connections` are deleted and re-added after each scheduler restart
     ##
     refreshConnections: true
   
   
     ## custom airflow variables for the airflow scheduler
     ##
     ## NOTE:
     ## - THIS IS A STRING, containing a JSON object, with your variables in it
     ##
     ## EXAMPLE:
     ##   variables: |
     ##     { "environment": "dev" }
     ##
     variables: |
       {}
   
     ## custom airflow pools for the airflow scheduler
     ##
     ## NOTE:
     ## - THIS IS A STRING, containing a JSON object, with your pools in it
     ##
     ## EXAMPLE:
     ##   pools: |
     ##     {
     ##       "example": {
     ##         "description": "This is an example pool with 2 slots.",
     ##         "slots": 2
     ##       }
     ##     }
     ##
     pools: |
       {}
   
     ## the value of the `airflow --num_runs` parameter used to run the airflow scheduler
     ##
     ## NOTE:
     ## - this is the number of 'dag refreshes' before the airflow scheduler process will exit
     ## - if not set to `-1`, the scheduler Pod will restart regularly
     ## - for most environments, `-1` will be an acceptable value
     ##
     numRuns: -1
   
     ## if we run `airflow initdb` when the scheduler starts
     ##
     initdb: true
   
     ## if we run `airflow initdb` inside a special initContainer
     ##
     ## NOTE:
     ## - may be needed if you have custom database hooks configured that will be pulled in by git-sync
     ##
     preinitdb: false
   
     ## the number of seconds to wait (in bash) before starting the scheduler container
     ##
     initialStartupDelay: 0
   
     livenessProbe:
       enabled: true
       initialDelaySeconds: 300
       periodSeconds: 30
       failureThreshold: 5
   
   
   ###################################
   # Airflow - Worker Configs
   ###################################
   workers:
     ## if the airflow workers StatefulSet should be deployed
     ##
     enabled: false
   ###################################
   # Airflow - Flower Configs
   ###################################
   flower:
     ## if the Flower UI should be deployed
     ##
     ## NOTE:
     ## - only takes effect if `airflow.executor` is `CeleryExecutor`
     ##
     enabled: false
   ###################################
   # Airflow - Logs Configs
   ###################################
   logs:
     ## the airflow logs folder
     ##
     path: /opt/airflow/logs
   
     ## configs for the logs PVC
     ##
     persistence:
       ## if a persistent volume is mounted at `logs.path`
       ##
       enabled: false
   
       ## the name of an existing PVC to use
       ##
       existingClaim: ""
   
       ## sub-path under `logs.persistence.existingClaim` to use
       ##
       subPath: ""
   
       ## the name of the StorageClass used by the PVC
       ##
       ## NOTE:
       ## - if set to "", then `PersistentVolumeClaim/spec.storageClassName` is omitted
       ## - if set to "-", then `PersistentVolumeClaim/spec.storageClassName` is set to ""
       ##
       storageClass: ""
   
       ## the access mode of the PVC
       ##
       ## WARNING:
       ## - must be: `ReadWriteMany`
       ##
       ## NOTE:
       ## - different StorageClasses support different access modes:
       ##   https://kubernetes.io/docs/concepts/storage/persistent-volumes/#access-modes
       ##
       accessMode: ReadWriteMany
   
       ## the size of PVC to request
       ##
       size: 1Gi
   
   ###################################
   # Airflow - DAGs Configs
   ###################################
   dags:
     ## the airflow dags folder
     ##
     path: /opt/airflow/dags
   
     ## whether to disable pickling dags from the scheduler to workers
     ##
     ## NOTE:
     ## - sets AIRFLOW__CORE__DONOT_PICKLE
     ##
     doNotPickle: false
   
     ## install any Python `requirements.txt` at the root of `dags.path` automatically
     ##
     ## WARNING:
     ## - if set to true, and you are using `dags.git.gitSync`, you must also enable
     ##   `dags.initContainer` to ensure the requirements.txt is available at Pod start
     ##
     installRequirements: false
   
     ## configs for the dags PVC
     ##
     persistence:
       ## if a persistent volume is mounted at `dags.path`
       ##
       enabled: false
   
       ## the name of an existing PVC to use
       ##
       existingClaim: ""
   
       ## sub-path under `dags.persistence.existingClaim` to use
       ##
       subPath: ""
   
       ## the name of the StorageClass used by the PVC
       ##
       ## NOTE:
       ## - if set to "", then `PersistentVolumeClaim/spec.storageClassName` is omitted
       ## - if set to "-", then `PersistentVolumeClaim/spec.storageClassName` is set to ""
       ##
       storageClass: ""
   
       ## the access mode of the PVC
       ##
       ## WARNING:
       ## - must be one of: `ReadOnlyMany` or `ReadWriteMany`
       ##
       ## NOTE:
       ## - different StorageClasses support different access modes:
       ##   https://kubernetes.io/docs/concepts/storage/persistent-volumes/#access-modes
       ##
       accessMode: ReadOnlyMany
   
       ## the size of PVC to request
       ##
       size: 1Gi
   
     ## configs for the DAG git repository & sync container
     ##
     git:
       ## url of the git repository
       ##
       ## EXAMPLE: (HTTP)
       ##   url: "https://github.com/torvalds/linux.git"
       ##
       ## EXAMPLE: (SSH)
       ##   url: "ssh://[email protected]:torvalds/linux.git"
       ##
       url: "ssh://xxxxxxxxxxxxxxxxxxxxxxxx"
   
       ## the branch/tag/sha1 which we clone
       ##
       ref: master
   
       ## the name of a pre-created secret containing files for ~/.ssh/
       ##
       ## NOTE:
       ## - this is ONLY RELEVANT for SSH git repos
       ## - the secret commonly includes files: id_rsa, id_rsa.pub, known_hosts
       ## - known_hosts is NOT NEEDED if `git.sshKeyscan` is true
       ##
       secret: "airflow-git-keys"
   
       ## if we should implicitly trust [git.repoHost]:git.repoPort, by auto creating a ~/.ssh/known_hosts
       ##
       ## WARNING:
       ## - setting true will increase your vulnerability to a repo spoofing attack
       ##
       ## NOTE:
       ## - this is ONLY RELEVANT for SSH git repos
       ## - this is not needed if known_hosts is provided in `git.secret`
       ## - git.repoHost and git.repoPort ARE REQUIRED for this to work
       ##
       sshKeyscan: false
   
       ## the name of the private key file in your `git.secret`
       ##
       ## NOTE:
       ## - this is ONLY RELEVANT for PRIVATE SSH git repos
       ##
       privateKeyName: id_rsa
   
       ## the host name of the git repo
       ##
       ## NOTE:
       ## - this is ONLY REQUIRED for SSH git repos
       ##
       ## EXAMPLE:
       ##   repoHost: "github.com"
       ##
       repoHost: "10.240.245.11"
   
       ## the port of the git repo
       ##
       ## NOTE:
       ## - this is ONLY REQUIRED for SSH git repos
       ##
       repoPort: 22
   
       ## configs for the git-sync container
       ##
       gitSync:
         ## enable the git-sync sidecar container
         ##
         enabled: true
   
         ## resource requests/limits for the git-sync container
         ##
         ## NOTE:
         ## - when `workers.autoscaling` is true, YOU MUST SPECIFY a resource request
         ##
         ## EXAMPLE:
         ##   resources:
         ##     requests:
         ##       cpu: "50m"
         ##       memory: "64Mi"
         ##
         resources: {}
   
         ## the docker image for the git-sync container
         image:
           repository: private-harbor:8080/library/git
           tag: latest
           ## values: Always or IfNotPresent
           pullPolicy: Always
   
         ## the git sync interval in seconds
         ##
         refreshTime: 60
   
     ## configs for the git-clone container
     ##
     ## NOTE:
     ## - use this container if you want to only clone the external git repo
     ##   at Pod start-time, and not keep it synchronised afterwards
     ##
     initContainer:
       ## enable the git-clone sidecar container
       ##
       ## NOTE:
       ## - this is NOT required for the git-sync sidecar to work
       ## - this is mostly used for when `dags.installRequirements` is true to ensure that
       ##   requirements.txt is available at Pod start
       ##
       enabled: false
   
       ## resource requests/limits for the git-clone container
       ##
       ## EXAMPLE:
       ##   resources:
       ##     requests:
       ##       cpu: "50m"
       ##       memory: "64Mi"
       ##
       resources: {}
   
       ## the docker image for the git-clone container
       image:
         repository: private-harbor:8080/library/git 
         tag: latest
         ## values: Always or IfNotPresent
         pullPolicy: Always
   
       ## path to mount dags-data volume to
       ##
       ## WARNING:
       ## - this path is also used by the git-sync container
       ##
       mountPath: "/dags"
   
       ## sub-path under `dags.initContainer.mountPath` to sync dags to
       ##
       ## WARNING:
       ## - this path is also used by the git-sync container
       ## - this MUST INCLUDE the leading /
       ##
       ## EXAMPLE:
       ##   syncSubPath: "/subdirWithDags"
       ##
       syncSubPath: ""
   
   ###################################
   # Kubernetes - RBAC
   ###################################
   rbac:
     ## if Kubernetes RBAC resources are created
     ##
     ## NOTE:
     ## - these allow the service account to create/delete Pods in the airflow namespace,
     ##   which is required for the KubernetesPodOperator() to function
     ##
     create: true
   
     ## if the created RBAC Role has GET/LIST on Event resources
     ##
     ## NOTE:
     ## - this is needed for KubernetesPodOperator() to use `log_events_on_failure=True`
     ##
     events: false
   
   ###################################
   # Kubernetes - Service Account
   ###################################
   serviceAccount:
     ## if a Kubernetes ServiceAccount is created
     ##
     ## NOTE:
     ## - if false, you must create the service account outside of this helm chart,
     ##   with the name: `serviceAccount.name`
     ##
     create: true
   
     ## the name of the ServiceAccount
     ##
     ## NOTE:
     ## - by default the name is generated using the `airflow.serviceAccountName` template in `_helpers.tpl`
     ##
     name: ""
   
     ## annotations for the ServiceAccount
     ##
     ## EXAMPLE: (to use WorkloadIdentity in Google Cloud)
     ##   annotations:
      ##     iam.gke.io/gcp-service-account: <<GCP_SERVICE>>@<<GCP_PROJECT>>.iam.gserviceaccount.com
     ##
     annotations: {}
   ```
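
   In case it helps triage, here is the quick sanity check I would run over the `dags.git` values above. It is only a sketch based on the warnings in the chart's own values comments; the key names match the config block, but the checks themselves are my assumptions, not documented chart behavior:

```python
def check_git_values(git: dict) -> list:
    """Flag git value combinations the chart comments warn about."""
    problems = []
    url = git.get("url", "")
    if url.startswith("ssh://"):
        # Per the chart comments, SSH repos need a secret with the key files.
        if not git.get("secret"):
            problems.append("SSH repo but no `git.secret` with the SSH key")
        # Per the chart comments, sshKeyscan requires repoHost and repoPort.
        if git.get("sshKeyscan") and not (git.get("repoHost") and git.get("repoPort")):
            problems.append("`sshKeyscan: true` requires repoHost and repoPort")
    return problems

# Values as in the config above (the repo URL is redacted in this issue):
git_values = {
    "url": "ssh://xxxxxxxxxxxxxxxxxxxxxxxx",
    "ref": "master",
    "secret": "airflow-git-keys",
    "sshKeyscan": False,
    "repoHost": "10.240.245.11",
    "repoPort": 22,
}

print(check_git_values(git_values))  # []
```

   The values pass these checks, which is why I suspect the bug is in how the worker pod's init container spec is generated rather than in the values themselves.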


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
