pengchen created AIRFLOW-2642:
---------------------------------

             Summary: [kubernetes executor worker] the value of git-sync init 
container ENV GIT_SYNC_ROOT is wrong
                 Key: AIRFLOW-2642
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2642
             Project: Apache Airflow
          Issue Type: Bug
          Components: contrib
    Affects Versions: 2.0.0, 1.10
            Reporter: pengchen
            Assignee: pengchen
             Fix For: 1.10


There are two ways of syncing DAGs: PVC and git-sync. When git-sync is used, 
the generated worker pod YAML fragment is as follows:

 
{code:java}
worker container:
-------------------------------
containers:
- args:
  - airflow run tutorial1 print_date 2018-06-19T07:57:15.011693+00:00 --local -sd /root/airflow/dags/dags/example_dags/tutorial1.py
  command:
  - bash
  - -cx
  - --
  env:
  - name: AIRFLOW__CORE__AIRFLOW_HOME
    value: /root/airflow
  - name: AIRFLOW__CORE__EXECUTOR
    value: LocalExecutor
  - name: AIRFLOW__CORE__DAGS_FOLDER
    value: /tmp/dags
  - name: SQL_ALCHEMY_CONN
    valueFrom:
      secretKeyRef:
        key: sql_alchemy_conn
        name: airflow-secrets

init container:
-------------------------------
initContainers:
- env:
  - name: GIT_SYNC_REPO
    value: https://code.devops.xiaohongshu.com/pengchen/Airflow-DAGs.git
  - name: GIT_SYNC_BRANCH
    value: master
  - name: GIT_SYNC_ROOT
    value: /tmp
  - name: GIT_SYNC_DEST
    value: dags
  - name: GIT_SYNC_ONE_TIME
    value: "true"
  - name: GIT_SYNC_USERNAME
    value: XXX
  - name: GIT_SYNC_PASSWORD
    value: XXX
  image: library/git-sync-amd64:v2.0.5
  imagePullPolicy: IfNotPresent
  name: git-sync-clone
  resources: {}
  securityContext:
    runAsUser: 0
  terminationMessagePath: /dev/termination-log
  terminationMessagePolicy: File
  volumeMounts:
  - mountPath: /root/airflow/dags/
    name: airflow-dags
  - mountPath: /root/airflow/logs
    name: airflow-logs
  - mountPath: /root/airflow/airflow.cfg
    name: airflow-config
    readOnly: true
    subPath: airflow.cfg
  - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
    name: default-token-xz87t
    readOnly: true
{code}
According to this configuration, git-sync synchronizes the DAGs to the 
/tmp/dags directory. However, the worker container's command args (airflow run 
tutorial1 print_date 2018-06-19T07:57:15.011693+00:00 --local -sd 
/root/airflow/dags/dags/example_dags/tutorial1.py) are generated by the 
scheduler and point to /root/airflow/dags/dags, which is not where git-sync put 
the DAGs. The task therefore fails with the following error:
{code:java}
+ airflow run tutorial1 print_date 2018-06-19T07:57:15.011693+00:00 --local -sd 
/root/airflow/dags/dags/example_dags/tutorial1.py
[2018-06-19 07:57:29,075] {settings.py:174} INFO - setting.configure_orm(): 
Using pool settings. pool_size=5, pool_recycle=1800
[2018-06-19 07:57:29,232] {__init__.py:51} INFO - Using executor LocalExecutor
[2018-06-19 07:57:29,373] {models.py:219} INFO - Filling up the DagBag from 
/root/airflow/dags/dags/example_dags/tutorial1.py
[2018-06-19 07:57:29,648] {models.py:310} INFO - File 
/usr/local/lib/python2.7/dist-packages/airflow/example_dags/__init__.py assumed 
to contain no DAGs. Skipping.
Traceback (most recent call last):
File "/usr/local/bin/airflow", line 32, in <module>
args.func(args)
File "/usr/local/lib/python2.7/dist-packages/airflow/utils/cli.py", line 74, in 
wrapper
return f(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/airflow/bin/cli.py", line 475, in 
run
dag = get_dag(args)
File "/usr/local/lib/python2.7/dist-packages/airflow/bin/cli.py", line 146, in 
get_dag
'parse.'.format(args.dag_id))
airflow.exceptions.AirflowException: dag_id could not be found: tutorial1. 
Either the dag did not exist or it failed to parse.
{code}
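
The mismatch can be illustrated with a small path check. This is only a sketch: the helper names (git_sync_checkout_dir, is_under) are made up for illustration and are not part of Airflow or git-sync, and it assumes git-sync exposes the checkout at <GIT_SYNC_ROOT>/<GIT_SYNC_DEST>, as described above.

```python
import posixpath


def git_sync_checkout_dir(root, dest):
    """Assumed location of the synced repo: <GIT_SYNC_ROOT>/<GIT_SYNC_DEST>."""
    return posixpath.normpath(posixpath.join(root, dest))


def is_under(path, directory):
    """Return True if path equals directory or lies below it (string check only)."""
    path = posixpath.normpath(path)
    directory = posixpath.normpath(directory)
    return path == directory or path.startswith(directory.rstrip("/") + "/")


# Path passed to the worker via -sd, taken from the args above.
requested = "/root/airflow/dags/dags/example_dags/tutorial1.py"

# Values from the generated init container (GIT_SYNC_ROOT=/tmp).
reported = git_sync_checkout_dir("/tmp", "dags")

# Same check with GIT_SYNC_ROOT set to the dags volume mount path.
proposed = git_sync_checkout_dir("/root/airflow/dags/", "dags")

print(reported, is_under(requested, reported))
print(proposed, is_under(requested, proposed))
```

With the generated values the requested file is outside the synced directory; with GIT_SYNC_ROOT pointing at the dags mount path it is inside it.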
 

The log shows that the worker cannot find the corresponding DAG file, so I 
think the environment variable GIT_SYNC_ROOT should be consistent with 
dag_volume_mount_path.

The worker's environment variable AIRFLOW__CORE__DAGS_FOLDER (/tmp/dags) also 
has no effect here: the DagBag is filled from the -sd path passed by the 
scheduler.
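
A rough sketch of the proposed behaviour, assuming the init container env is built as a list of name/value dicts. The function build_git_sync_env and its parameters are hypothetical and do not reflect the actual Airflow code; the point is only that GIT_SYNC_ROOT is derived from dag_volume_mount_path rather than set independently.

```python
def build_git_sync_env(repo, branch, dag_volume_mount_path, dest="dags"):
    """Hypothetical helper: build the git-sync init container env.

    GIT_SYNC_ROOT is taken from the dags volume mount path so that the
    checkout lands on the volume the worker reads its DAGs from.
    """
    return [
        {"name": "GIT_SYNC_REPO", "value": repo},
        {"name": "GIT_SYNC_BRANCH", "value": branch},
        {"name": "GIT_SYNC_ROOT", "value": dag_volume_mount_path},
        {"name": "GIT_SYNC_DEST", "value": dest},
        {"name": "GIT_SYNC_ONE_TIME", "value": "true"},
    ]


env = build_git_sync_env(
    repo="https://code.devops.xiaohongshu.com/pengchen/Airflow-DAGs.git",
    branch="master",
    dag_volume_mount_path="/root/airflow/dags/",
)
env_by_name = {item["name"]: item["value"] for item in env}
print(env_by_name["GIT_SYNC_ROOT"])
```

Credentials (GIT_SYNC_USERNAME, GIT_SYNC_PASSWORD) are left out of the sketch on purpose.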

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
