aimran-adroll opened a new issue #10646:
URL: https://github.com/apache/airflow/issues/10646


   **Apache Airflow version**: 1.10.12, master
   
   
   **Kubernetes version: v1.17.9-eks-4c6976 (server)/ v.1.18.6 (client)
   
   **Environment**:
   
   - **Cloud provider or hardware configuration**: EKS
   - **OS** (e.g. from /etc/os-release): 
   - **Kernel** (e.g. `uname -a`):
   - **Install tools**:
   - **Others**:
   
   **What happened**: 
   
   Current logic of setting `dags_volume_subpath` is broken for the following 
use case:
   
   > Dag loaded from PVC but gitSync disabled.
   
   
   I am using the chart from apache/airflow master thusly: 
   
   ```
   helm install airflow chart --namespace airflow-dev \
   --set dags.persistence.enabled=true \
   --set dags.persistence.existingClaim=airflow-dag-pvc \
   --set dags.gitSync.enabled=false
   ```
   
   For the longest time, even with a vanilla install, the workers kept dying. 
Tailing the logs clued me in that workers were not able to find the dag. I 
verified from the scheduler that dags were present (it showed up in the UI etc)
   
   Further debugging (looking at the worker pod config) was the main clue ... 
here is the volumemount
   
   ```yaml
       - mountPath: /opt/airflow/dags
         name: airflow-dags
         readOnly: true
         subPath: repo/tests/dags
   ```
   
   Why/who would add `repo/tests/dags` as a subpath?? 🤦‍♂️ 
   
   Finally found the problem logic here:
   
https://github.com/apache/airflow/blob/9b2efc6dcc298e3df4d1365fe809ea1dc0697b3b/chart/values.yaml#L556
   
   Note the implied connection between `dags.persistence.enabled` and 
`dags.gitSync`! This looks like some leftover code from when gitsync and 
external dag went hand in hand. 
   
   Ideally, an user should be able to use a PVC _without_ using git sync 
   
   
   
   
   
   <!-- (please include exact error messages if you can) -->
   
   **What you expected to happen**: 
   I should be able to use a PVC without gitsync logic messing up my mount path
   
   <!-- What do you think went wrong? -->
   
   See above. 
   
   **How to reproduce it**:
   I am kinda surprised that this has not bitten anyone yet. I'd like to think 
my example is essentially a `hello world` of helm chart with an dag from a PVC.
   
   
   **Anything else we need to know**:
   
   This is the patch that worked for me. Seems fairly reasonable -- only muck 
with dags_volume_subpath if gitSync is enabled.
   
   Even better would be to audit other code and clearly separate out the 
different competing use case
   
   1. use PVC but no gitsync
   2. use PVC with gitsync
   3. use gitsync without PVC
   
   ```
   diff --git a/chart/values.yaml b/chart/values.yaml
   index 00832b435..8f6506cd4 100644
   --- a/chart/values.yaml
   +++ b/chart/values.yaml
   @@ -550,10 +550,10 @@ config:
        delete_worker_pods: 'True'
        run_as_user: '{{ .Values.uid }}'
        fs_group: '{{ .Values.gid }}'
        dags_volume_claim: '{{- if .Values.dags.persistence.enabled }}{{ 
include "airflow_dags_volume_claim" . }}{{ end }}'
   -    dags_volume_subpath: '{{- if .Values.dags.persistence.enabled 
}}{{.Values.dags.gitSync.dest }}/{{ .Values.dags.gitSync.subPath }}{{ end }}'
   +    dags_volume_subpath: '{{- if .Values.dags.gitSync.enabled 
}}{{.Values.dags.gitSync.dest }}/{{ .Values.dags.gitSync.subPath }}{{ end }}'
        git_repo: '{{- if and .Values.dags.gitSync.enabled (not 
.Values.dags.persistence.enabled) }}{{ .Values.dags.gitSync.repo }}{{ end }}'
        git_branch: '{{ .Values.dags.gitSync.branch }}'
        git_sync_rev: '{{ .Values.dags.gitSync.rev }}'
   ```
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to