ZackingIt opened a new issue #18025:
URL: https://github.com/apache/airflow/issues/18025


   ### Apache Airflow version
   
   2.1.3 (latest released)
   
   ### Operating System
   
   Ubuntu VERSION="16.04.6 LTS (Xenial Xerus)"
   
   ### Versions of Apache Airflow Providers
   
   2.1.3
   
   ### Deployment
   
   Docker-Compose
   
   ### Deployment details
   
   My docker-compose.yaml is identical to the default dev box example except I 
mount a few additional volumes:
   
   ```
     volumes:
       - ./dags:/opt/airflow/dags
       - ./logs:/opt/airflow/logs
       - ./plugins:/opt/airflow/plugins
       - ./kubernetes_configs:/opt/airflow/kubernetes_configs
       - ./airflow.cfg:/opt/airflow/airflow.cfg
   ```
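
   For context, my assumption about how these paths line up: with the `./kubernetes_configs` mount, `/opt/airflow/kubernetes_configs/foo.yaml` exists inside every container, and that is the location the relative `config_file` used below should resolve to via `template_searchpath`. A quick way to check from inside a container (illustrative snippet, the variable names are mine):

    ```
    # Confirm the mounted kube config is where the relative path resolves.
    from pathlib import Path

    searchpath = Path("/opt/airflow")            # matches template_searchpath
    config_file = "kubernetes_configs/foo.yaml"  # as passed to the operator

    full_path = searchpath / config_file
    print(full_path, "exists:", full_path.exists())  # the traceback below shows the file *is* found and read
    ```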
   
   ### What happened
   
   KubernetesPodOperator seems to be *opening* the config file at the given location and then passing the opened file's *contents* where a file path is expected, rather than simply passing the resolved path string, i.e. the composition of the `template_searchpath` and the relative `config_file` path.
   
   ```
    AIRFLOW_CTX_DAG_RUN_ID=manual__2021-09-03T20:22:20.691296+00:00
    [2021-09-03 20:22:23,803] {taskinstance.py:1462} ERROR - Task failed with exception
    Traceback (most recent call last):
      File "/home/airflow/.local/lib/python3.6/site-packages/airflow/models/taskinstance.py", line 1164, in _run_raw_task
        self._prepare_and_execute_task_with_callbacks(context, task)
      File "/home/airflow/.local/lib/python3.6/site-packages/airflow/models/taskinstance.py", line 1282, in _prepare_and_execute_task_with_callbacks
        result = self._execute_task(context, task_copy)
      File "/home/airflow/.local/lib/python3.6/site-packages/airflow/models/taskinstance.py", line 1312, in _execute_task
        result = task_copy.execute(context=context)
      File "/home/airflow/.local/lib/python3.6/site-packages/airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py", line 336, in execute
        config_file=self.config_file,
      File "/home/airflow/.local/lib/python3.6/site-packages/airflow/kubernetes/kube_client.py", line 145, in get_kube_client
        client_conf = _get_kube_config(in_cluster, cluster_context, config_file)
      File "/home/airflow/.local/lib/python3.6/site-packages/airflow/kubernetes/kube_client.py", line 46, in _get_kube_config
        load_kube_config(client_configuration=cfg, config_file=config_file, context=cluster_context)
      File "/home/airflow/.local/lib/python3.6/site-packages/airflow/kubernetes/refresh_config.py", line 123, in load_kube_config
        loader = _get_kube_config_loader_for_yaml_file(config_file, active_context=context, config_persister=None)
      File "/home/airflow/.local/lib/python3.6/site-packages/airflow/kubernetes/refresh_config.py", line 105, in _get_kube_config_loader_for_yaml_file
        with open(filename) as f:
    FileNotFoundError: [Errno 2] No such file or directory: 'apiVersion: batch/v1beta1\nkind: CronJob\nmetadata:\n  name: zek-01\nspec:\n  schedule: "0 */1 * * *"\n  concurrencyPolicy: Forbid\n  jobTemplate:\n    spec:\n      template:\n        metadata:\n          labels:\n            networkpolicy.xandr.com/unrestrictedEgress: "true"\n        spec:\n          containers:\n          - name: zek-01\n            image: docker.artifactory.prod.adnxs.net/ssp-object-sync:0.87\n            env:\n            - name: GROUP\n              value: "1"\n            - name: ENV\n              value: "TEST"\n            - name: SCRIPT\n              value: "ADX-CREATIVE-REGISTER"\n            - name: "PARAM_api_user"\n              valueFrom:\n                secretKeyRef:\n                  name: ssp-object-sync-secrets-production\n                  key: "PARAM_api_user"\n            - name: "PARAM_api_passwd"\n              valueFrom:\n                secretKeyRef:\n                  name: ssp-object-sync-secrets-production\n                  key: "PARAM_api_passwd"\n            - name: "PARAM_cron_db_user"\n              valueFrom:\n                secretKeyRef:\n                  name: ssp-object-sync-secrets-production\n                  key: "PARAM_cron_db_user"\n            - name: "PARAM_cron_db_passwd"\n              valueFrom:\n                secretKeyRef:\n                  name: ssp-object-sync-secrets-production\n                  key: "PARAM_cron_db_passwd"\n            - name: "PARAM_int_db_user"\n              valueFrom:\n                secretKeyRef:\n                  name: ssp-object-sync-secrets-production\n                  key: "PARAM_int_db_user"\n            - name: "PARAM_int_db_passwd"\n              valueFrom:\n                secretKeyRef:\n                  name: ssp-object-sync-secrets-production\n                  key: "PARAM_int_db_passwd"\n            resources:\n              limits:\n                cpu: "1"\n                memory: "2.5Gi"\n              requests:\n                cpu: "0.1"\n                memory: "1Gi"\n            volumeMounts:\n            - mountPath: /var/log/adnexus\n              name: log-volume\n          volumes:\n          - name: log-volume\n            emptyDir: {}\n          restartPolicy: Never'
    [2021-09-03 20:22:23,804] {taskinstance.py:1512} INFO - Marking task as FAILED. dag_id=aaa_sample_dag_2, task_id=noop_zek_1, execution_date=20210903T202220, start_date=20210903T202223, end_date=20210903T202223
    [2021-09-03 20:22:23,854] {local_task_job.py:151} INFO - Task exited with return code 1
    [2021-09-03 20:22:23,875] {local_task_job.py:261} INFO - 0 downstream tasks scheduled from follow-on schedule check
   ```
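
   My best guess at the mechanism (an assumption on my part, not confirmed against the Airflow source): `config_file` is one of the operator's templated fields, and because its value ends in `.yaml`, the templating step treats the string as a template *filename*, loads it from `template_searchpath`, and substitutes the rendered file contents in place of the path. A minimal sketch of that behavior with plain Jinja2 (`resolve_templated_field` and `TEMPLATE_EXT` are my own illustrative names, not Airflow APIs):

    ```
    # Sketch: how a templated field ending in a "template extension" comes
    # back holding file *contents* instead of the file *path*.
    from jinja2 import Environment, FileSystemLoader

    TEMPLATE_EXT = ('.yaml', '.yml')  # extensions treated as template files

    def resolve_templated_field(value: str, searchpath: str) -> str:
        env = Environment(loader=FileSystemLoader(searchpath))
        if value.endswith(TEMPLATE_EXT):
            # The string is treated as a template filename: the file is
            # loaded from the searchpath and its *rendered contents* are
            # returned -- the original path string is lost here.
            return env.get_template(value).render()
        # Otherwise the string itself is rendered as an inline template.
        return env.from_string(value).render()

    # With config_file='kubernetes_configs/foo.yaml' this returns the YAML
    # text, which kube_client then passes straight to open(), producing
    # the FileNotFoundError above.
    print(resolve_templated_field('kubernetes_configs/foo.yaml', '/opt/airflow'))
    ```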
   
   
   ### What you expected to happen
   
   This is the code that's running. I would expect `config_file='kubernetes_configs/foo.yaml'` to resolve to the correct file path, and for the KubernetesPodOperator to consume the generated *file path*, not that file's *contents*.
   
    ```
    from airflow import DAG
    from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import (
        KubernetesPodOperator,
    )
    from airflow.utils.dates import days_ago

    args = {
        'owner': 'sup_team',
        'start_date': days_ago(1),
    }

    dag = DAG(
        dag_id='aaa_sample_dag_2',
        default_args=args,
        schedule_interval=None,
        template_searchpath="/opt/airflow/",
    )

    with dag:
        kube_operator_1 = KubernetesPodOperator(
            task_id='noop_zek_1',
            name='zekk',
            namespace='default',
            cmds=['echo'],
            in_cluster=False,
            do_xcom_push=False,
            config_file='kubernetes_configs/foo.yaml',
        )
    ```
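
   Assuming the templating explanation above is right, one workaround sketch (untested) is to stop the `.yaml` extension from triggering file templating at all, by subclassing the operator with an empty `template_ext`:

    ```
    # Untested workaround sketch: clear template_ext so the config_file
    # string passes through templating as a plain path instead of being
    # replaced by the file's rendered contents.
    from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import (
        KubernetesPodOperator,
    )

    class KubernetesPodOperatorPlainPaths(KubernetesPodOperator):
        template_ext = ()  # no field value is ever treated as a template file

    # Drop-in replacement for KubernetesPodOperator in the DAG above, e.g.:
    # kube_operator_1 = KubernetesPodOperatorPlainPaths(
    #     task_id='noop_zek_1', ..., config_file='kubernetes_configs/foo.yaml')
    ```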
   
   
   ### How to reproduce
   
   You can reproduce this with a simple `docker-compose up` using the following docker-compose.yaml file -- just make sure the final result has a `kubernetes_configs` folder with a `foo.yaml` file inside:
   
    ```
    # Licensed to the Apache Software Foundation (ASF) under one
    # or more contributor license agreements.  See the NOTICE file
    # distributed with this work for additional information
    # regarding copyright ownership.  The ASF licenses this file
    # to you under the Apache License, Version 2.0 (the
    # "License"); you may not use this file except in compliance
    # with the License.  You may obtain a copy of the License at
    #
    #   http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing,
    # software distributed under the License is distributed on an
    # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    # KIND, either express or implied.  See the License for the
    # specific language governing permissions and limitations
    # under the License.
    #

    # Basic Airflow cluster configuration for CeleryExecutor with Redis and PostgreSQL.
    #
    # WARNING: This configuration is for local development. Do not use it in a production deployment.
    #
    # This configuration supports basic configuration using environment variables or an .env file
    # The following variables are supported:
    #
    # AIRFLOW_IMAGE_NAME           - Docker image name used to run Airflow.
    #                                Default: apache/airflow:master-python3.8
    # AIRFLOW_UID                  - User ID in Airflow containers
    #                                Default: 50000
    # AIRFLOW_GID                  - Group ID in Airflow containers
    #                                Default: 0
    #
    # Those configurations are useful mostly in case of standalone testing/running Airflow in test/try-out mode
    #
    # _AIRFLOW_WWW_USER_USERNAME   - Username for the administrator account (if requested).
    #                                Default: airflow
    # _AIRFLOW_WWW_USER_PASSWORD   - Password for the administrator account (if requested).
    #                                Default: airflow
    # _PIP_ADDITIONAL_REQUIREMENTS - Additional PIP requirements to add when starting all containers.
    #                                Default: ''
    #
    # Feel free to modify this file to suit your needs.
    ---
    version: '3'
    x-airflow-common:
      &airflow-common
      # In order to add custom dependencies or upgrade provider packages you can use your extended image.
      # Comment the image line, place your Dockerfile in the directory where you placed the docker-compose.yaml
      # and uncomment the "build" line below. Then run `docker-compose build` to build the images.
      image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.1.3}
      # build: .
      environment:
        &airflow-common-env
        AIRFLOW__CORE__EXECUTOR: CeleryExecutor
        AIRFLOW__CORE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow
        AIRFLOW__CELERY__RESULT_BACKEND: db+postgresql://airflow:airflow@postgres/airflow
        AIRFLOW__CELERY__BROKER_URL: redis://:@redis:6379/0
        AIRFLOW__CORE__FERNET_KEY: ''
        AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION: 'true'
        AIRFLOW__CORE__LOAD_EXAMPLES: 'true'
        AIRFLOW__API__AUTH_BACKEND: 'airflow.api.auth.backend.basic_auth'
        _PIP_ADDITIONAL_REQUIREMENTS: ${_PIP_ADDITIONAL_REQUIREMENTS:-}
      volumes:
        - ./dags:/opt/airflow/dags
        - ./logs:/opt/airflow/logs
        - ./plugins:/opt/airflow/plugins
        - ./kubernetes_configs:/opt/airflow/kubernetes_configs
        - ./airflow.cfg:/opt/airflow/airflow.cfg
      user: "${AIRFLOW_UID:-50000}:${AIRFLOW_GID:-0}"
      depends_on:
        redis:
          condition: service_healthy
        postgres:
          condition: service_healthy

    services:
      postgres:
        image: postgres:13
        deploy:
          resources:
            limits:
              cpus: 2
              memory: 2096M
            reservations:
              cpus: 1
              memory: 1048M
        environment:
          POSTGRES_USER: airflow
          POSTGRES_PASSWORD: airflow
          POSTGRES_DB: airflow
        volumes:
          - postgres-db-volume:/var/lib/postgresql/data
        healthcheck:
          test: ["CMD", "pg_isready", "-U", "airflow"]
          interval: 5s
          retries: 5
        restart: always

      redis:
        image: redis:latest
        deploy:
          resources:
            limits:
              cpus: 2
              memory: 2096M
            reservations:
              cpus: 1
              memory: 1048M
        ports:
          - 6379:6379
        healthcheck:
          test: ["CMD", "redis-cli", "ping"]
          interval: 5s
          timeout: 30s
          retries: 50
        restart: always

      airflow-webserver:
        <<: *airflow-common
        command: webserver
        deploy:
          resources:
            limits:
              cpus: 2
              memory: 2096M
            reservations:
              cpus: 1
              memory: 1048M
        ports:
          - 8080:8080
        healthcheck:
          test: ["CMD", "curl", "--fail", "http://localhost:8080/health"]
          interval: 10s
          timeout: 10s
          retries: 5
        restart: always

      airflow-scheduler:
        <<: *airflow-common
        command: scheduler
        healthcheck:
          test: ["CMD-SHELL", 'airflow jobs check --job-type SchedulerJob --hostname "$${HOSTNAME}"']
          interval: 10s
          timeout: 10s
          retries: 5
        restart: always

      airflow-worker:
        <<: *airflow-common
        command: celery worker
        healthcheck:
          test:
            - "CMD-SHELL"
            - 'celery --app airflow.executors.celery_executor.app inspect ping -d "celery@$${HOSTNAME}"'
          interval: 10s
          timeout: 10s
          retries: 5
        restart: always

      airflow-init:
        <<: *airflow-common
        entrypoint: /bin/bash
        command:
          - -c
          - |
            function ver() {
              printf "%04d%04d%04d%04d" $${1//./ }
            }
            airflow_version=$$(gosu airflow airflow version)
            airflow_version_comparable=$$(ver $${airflow_version})
            min_airflow_version=2.1.0
            min_airflow_version_comparable=$$(ver $${min_airflow_version})
            if (( airflow_version_comparable < min_airflow_version_comparable )); then
              echo -e "\033[1;31mERROR!!!: Too old Airflow version $${airflow_version}!\e[0m"
              echo "The minimum Airflow version supported: $${min_airflow_version}. Only use this or higher!"
              exit 1
            fi
            if [[ -z "${AIRFLOW_UID}" ]]; then
              echo -e "\033[1;31mERROR!!!: AIRFLOW_UID not set!\e[0m"
              echo "Please follow these instructions to set AIRFLOW_UID and AIRFLOW_GID environment variables:
                https://airflow.apache.org/docs/apache-airflow/stable/start/docker.html#initializing-environment"
              exit 1
            fi
            one_meg=1048576
            mem_available=$$(($$(getconf _PHYS_PAGES) * $$(getconf PAGE_SIZE) / one_meg))
            cpus_available=$$(grep -cE 'cpu[0-9]+' /proc/stat)
            disk_available=$$(df / | tail -1 | awk '{print $$4}')
            warning_resources="false"
            if (( mem_available < 4000 )) ; then
              echo -e "\033[1;33mWARNING!!!: Not enough memory available for Docker.\e[0m"
              echo "At least 4GB of memory required. You have $$(numfmt --to iec $$((mem_available * one_meg)))"
              warning_resources="true"
            fi
            if (( cpus_available < 2 )); then
              echo -e "\033[1;33mWARNING!!!: Not enough CPUS available for Docker.\e[0m"
              echo "At least 2 CPUs recommended. You have $${cpus_available}"
              warning_resources="true"
            fi
            if (( disk_available < one_meg * 10 )); then
              echo -e "\033[1;33mWARNING!!!: Not enough Disk space available for Docker.\e[0m"
              echo "At least 10 GBs recommended. You have $$(numfmt --to iec $$((disk_available * 1024 )))"
              warning_resources="true"
            fi
            if [[ $${warning_resources} == "true" ]]; then
              echo
              echo -e "\033[1;33mWARNING!!!: You have not enough resources to run Airflow (see above)!\e[0m"
              echo "Please follow the instructions to increase amount of resources available:"
              echo "   https://airflow.apache.org/docs/apache-airflow/stable/start/docker.html#before-you-begin"
            fi
            mkdir -p /sources/logs /sources/dags /sources/plugins
            chown -R "${AIRFLOW_UID}:${AIRFLOW_GID}" /sources/{logs,dags,plugins}
            exec /entrypoint airflow version
        environment:
          <<: *airflow-common-env
          _AIRFLOW_DB_UPGRADE: 'true'
          _AIRFLOW_WWW_USER_CREATE: 'true'
          _AIRFLOW_WWW_USER_USERNAME: ${_AIRFLOW_WWW_USER_USERNAME:-airflow}
          _AIRFLOW_WWW_USER_PASSWORD: ${_AIRFLOW_WWW_USER_PASSWORD:-airflow}
        user: "0:${AIRFLOW_GID:-0}"
        volumes:
          - .:/sources

      flower:
        <<: *airflow-common
        command: celery flower
        ports:
          - 5555:5555
        healthcheck:
          test: ["CMD", "curl", "--fail", "http://localhost:5555/"]
          interval: 10s
          timeout: 10s
          retries: 5
        restart: always

    volumes:
      postgres-db-volume:
    ```
   
   
   ### Anything else
   
   Problem occurs every time.
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

