GitHub user Steexyz created a discussion: The conn_id `xxxxxxx` isn't defined (Google Cloud Platform)
### Apache Airflow Provider(s)

google

### Versions of Apache Airflow Providers

19.3.0

### Apache Airflow version

3.1.6

### Operating System

Debian GNU/Linux 12 (bookworm)

### Deployment

Docker-Compose

### Deployment details

Deployment was made using the base docker-compose file.

```yaml
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#
# Basic Airflow cluster configuration for CeleryExecutor with Redis and PostgreSQL.
#
# WARNING: This configuration is for local development. Do not use it in a production deployment.
#
# This configuration supports basic configuration using environment variables or an .env file
# The following variables are supported:
#
# AIRFLOW_IMAGE_NAME           - Docker image name used to run Airflow.
#                                Default: apache/airflow:3.1.6
# AIRFLOW_UID                  - User ID in Airflow containers
#                                Default: 50000
# AIRFLOW_PROJ_DIR             - Base path to which all the files will be volumed.
#                                Default: .
# Those configurations are useful mostly in case of standalone testing/running Airflow in test/try-out mode
#
# _AIRFLOW_WWW_USER_USERNAME   - Username for the administrator account (if requested).
#                                Default: airflow
# _AIRFLOW_WWW_USER_PASSWORD   - Password for the administrator account (if requested).
#                                Default: airflow
# _PIP_ADDITIONAL_REQUIREMENTS - Additional PIP requirements to add when starting all containers.
#                                Use this option ONLY for quick checks. Installing requirements at container
#                                startup is done EVERY TIME the service is started.
#                                A better way is to build a custom image or extend the official image
#                                as described in https://airflow.apache.org/docs/docker-stack/build.html.
#                                Default: ''
#
# Feel free to modify this file to suit your needs.
---
x-airflow-common:
  &airflow-common
  # In order to add custom dependencies or upgrade provider distributions you can use your extended image.
  # Comment the image line, place your Dockerfile in the directory where you placed the docker-compose.yaml
  # and uncomment the "build" line below, Then run `docker-compose build` to build the images.
  image: custom-airflow-3:3.1.6
  # build: .
  environment:
    &airflow-common-env
    AIRFLOW__CORE__EXECUTOR: CeleryExecutor
    AIRFLOW__CORE__AUTH_MANAGER: airflow.providers.fab.auth_manager.fab_auth_manager.FabAuthManager
    AIRFLOW__DATABASE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow
    AIRFLOW__CORE__FERNET_KEY: '8_4t1Ld5YV7sS3k9n-Bv2pXgZc1R0fHwJmNlUa6I7oQ='
    AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION: 'true'
    AIRFLOW__CELERY__RESULT_BACKEND: db+postgresql://airflow:airflow@postgres/airflow
    AIRFLOW__CELERY__BROKER_URL: redis://:@redis:6379/0
    AIRFLOW__CORE__LOAD_EXAMPLES: 'false'
    AIRFLOW__CORE__EXECUTION_API_SERVER_URL: 'http://airflow-apiserver:8080/execution/'
    # yamllint disable rule:line-length
    # Use simple http server on scheduler for health checks
    # See https://airflow.apache.org/docs/apache-airflow/stable/administration-and-deployment/logging-monitoring/check-health.html#scheduler-health-check-server
    # yamllint enable rule:line-length
    AIRFLOW__SCHEDULER__ENABLE_HEALTH_CHECK: 'true'
    # WARNING: Use _PIP_ADDITIONAL_REQUIREMENTS option ONLY for a quick checks
    # The following line can be used to set a custom config file, stored in the local config folder
    AIRFLOW_CONFIG: '/opt/airflow/config/airflow.cfg'
  volumes:
    - ./dags:/opt/airflow/dags
    - ./logs:/opt/airflow/logs
    - ./plugins:/opt/airflow/plugins
    - ./config:/opt/airflow/config
    - ${appdata}/gcloud:/home/airflow/.config/gcloud
  user: "${AIRFLOW_UID:-50000}:0"
  depends_on:
    &airflow-common-depends-on
    redis:
      condition: service_healthy
    postgres:
      condition: service_healthy

services:
  postgres:
    image: postgres:16
    environment:
      POSTGRES_USER: airflow
      POSTGRES_PASSWORD: airflow
      POSTGRES_DB: airflow
    volumes:
      - postgres-db-volume:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD", "pg_isready", "-U", "airflow"]
      interval: 10s
      retries: 5
      start_period: 5s
    restart: always

  redis:
    # Redis is limited to 7.2-bookworm due to licencing change
    # https://redis.io/blog/redis-adopts-dual-source-available-licensing/
    image: redis:7.2-bookworm
    expose:
      - 6379
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 30s
      retries: 50
      start_period: 30s
    restart: always

  airflow-apiserver:
    <<: *airflow-common
    command: api-server
    ports:
      - "8080:8080"
    healthcheck:
      test: ["CMD", "curl", "--fail", "http://localhost:8080/api/v2/version"]
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 30s
    restart: always
    depends_on:
      <<: *airflow-common-depends-on
      airflow-init:
        condition: service_completed_successfully

  airflow-scheduler:
    <<: *airflow-common
    command: scheduler
    healthcheck:
      test: ["CMD", "curl", "--fail", "http://localhost:8974/health"]
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 30s
    restart: always
    depends_on:
      <<: *airflow-common-depends-on
      airflow-init:
        condition: service_completed_successfully

  airflow-dag-processor:
    <<: *airflow-common
    command: dag-processor
    healthcheck:
      test: ["CMD-SHELL", 'airflow jobs check --job-type DagProcessorJob --hostname "$${HOSTNAME}"']
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 30s
    restart: always
    depends_on:
      <<: *airflow-common-depends-on
      airflow-init:
        condition: service_completed_successfully

  airflow-triggerer:
    <<: *airflow-common
    command: triggerer
    healthcheck:
      test: ["CMD-SHELL", 'airflow jobs check --job-type TriggererJob --hostname "$${HOSTNAME}"']
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 30s
    restart: always
    depends_on:
      <<: *airflow-common-depends-on
      airflow-init:
        condition: service_completed_successfully

  airflow-worker:
    <<: *airflow-common
    command: celery worker
    healthcheck:
      # yamllint disable rule:line-length
      test:
        - "CMD-SHELL"
        - 'celery --app airflow.providers.celery.executors.celery_executor.app inspect ping -d "celery@$${HOSTNAME}" || celery --app airflow.executors.celery_executor.app inspect ping -d "celery@$${HOSTNAME}"'
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 30s
    environment:
      <<: *airflow-common-env
      # Required to handle warm shutdown of the celery workers properly
      # See https://airflow.apache.org/docs/docker-stack/entrypoint.html#signal-propagation
      DUMB_INIT_SETSID: "0"
    restart: always
    depends_on:
      <<: *airflow-common-depends-on
      airflow-apiserver:
        condition: service_healthy
      airflow-init:
        condition: service_completed_successfully

  airflow-init:
    <<: *airflow-common
    entrypoint: /bin/bash
    # yamllint disable rule:line-length
    command:
      - -c
      - |
        if [[ -z "50000" ]]; then
          echo
          echo -e "\033[1;33mWARNING!!!: AIRFLOW_UID not set!\e[0m"
          echo "If you are on Linux, you SHOULD follow the instructions below to set "
          echo "AIRFLOW_UID environment variable, otherwise files will be owned by root."
          echo "For other operating systems you can get rid of the warning with manually created .env file:"
          echo "    See: https://airflow.apache.org/docs/apache-airflow/stable/howto/docker-compose/index.html#setting-the-right-airflow-user"
          echo
          export AIRFLOW_UID=$$(id -u)
        fi
        one_meg=1048576
        mem_available=$$(($$(getconf _PHYS_PAGES) * $$(getconf PAGE_SIZE) / one_meg))
        cpus_available=$$(grep -cE 'cpu[0-9]+' /proc/stat)
        disk_available=$$(df / | tail -1 | awk '{print $$4}')
        warning_resources="false"
        if (( mem_available < 4000 )) ; then
          echo
          echo -e "\033[1;33mWARNING!!!: Not enough memory available for Docker.\e[0m"
          echo "At least 4GB of memory required. You have $$(numfmt --to iec $$((mem_available * one_meg)))"
          echo
          warning_resources="true"
        fi
        if (( cpus_available < 2 )); then
          echo
          echo -e "\033[1;33mWARNING!!!: Not enough CPUS available for Docker.\e[0m"
          echo "At least 2 CPUs recommended. You have $${cpus_available}"
          echo
          warning_resources="true"
        fi
        if (( disk_available < one_meg * 10 )); then
          echo
          echo -e "\033[1;33mWARNING!!!: Not enough Disk space available for Docker.\e[0m"
          echo "At least 10 GBs recommended. You have $$(numfmt --to iec $$((disk_available * 1024 )))"
          echo
          warning_resources="true"
        fi
        if [[ $${warning_resources} == "true" ]]; then
          echo
          echo -e "\033[1;33mWARNING!!!: You have not enough resources to run Airflow (see above)!\e[0m"
          echo "Please follow the instructions to increase amount of resources available:"
          echo "   https://airflow.apache.org/docs/apache-airflow/stable/howto/docker-compose/index.html#before-you-begin"
          echo
        fi
        echo
        echo "Creating missing opt dirs if missing:"
        echo
        mkdir -v -p /opt/airflow/{logs,dags,plugins,config}
        echo
        echo "Airflow version:"
        /entrypoint airflow version
        echo
        echo "Files in shared volumes:"
        echo
        ls -la /opt/airflow/{logs,dags,plugins,config}
        echo
        echo "Running airflow config list to create default config file if missing."
        echo
        /entrypoint airflow config list >/dev/null
        echo
        echo "Files in shared volumes:"
        echo
        ls -la /opt/airflow/{logs,dags,plugins,config}
        echo
        echo "Change ownership of files in /opt/airflow to 50000:0"
        echo
        chown -R "50000:0" /opt/airflow/
        echo
        echo "Change ownership of files in shared volumes to 50000:0"
        echo
        chown -v -R "50000:0" /opt/airflow/{logs,dags,plugins,config}
        echo
        echo "Files in shared volumes:"
        echo
        ls -la /opt/airflow/{logs,dags,plugins,config}
    # yamllint enable rule:line-length
    environment:
      <<: *airflow-common-env
      _AIRFLOW_DB_MIGRATE: 'true'
      _AIRFLOW_WWW_USER_CREATE: 'true'
      _AIRFLOW_WWW_USER_USERNAME: ${_AIRFLOW_WWW_USER_USERNAME:-airflow}
      _AIRFLOW_WWW_USER_PASSWORD: ${_AIRFLOW_WWW_USER_PASSWORD:-airflow}
      _PIP_ADDITIONAL_REQUIREMENTS: ''
    user: "0:0"

  airflow-cli:
    <<: *airflow-common
    profiles:
      - debug
    environment:
      <<: *airflow-common-env
      CONNECTION_CHECK_MAX_COUNT: "0"
    # Workaround for entrypoint issue. See: https://github.com/apache/airflow/issues/16252
    command:
      - bash
      - -c
      - airflow
    depends_on:
      <<: *airflow-common-depends-on

volumes:
  postgres-db-volume:
```

A custom image has been made with additional requirements:

```
apache-airflow-providers-fab
apache-airflow-providers-celery
connexion[swagger-ui]
apache-airflow-providers-jdbc
apache-airflow-providers-oracle[common.sql]
apache-airflow-providers-snowflake
apache-airflow-providers-docker
airflow-provider-rabbitmq
apache-airflow-providers-google
google-cloud-storage
pandas
zeep
xmltodict
orjson
pyodbc
```

Dockerfile:

```dockerfile
FROM apache/airflow:3.1.6-python3.12

ARG AIRFLOW_VERSION=3.1.6
ARG PYTHON_VERSION=3.12
ARG CONSTRAINT_URL="https://raw.githubusercontent.com/apache/airflow/constraints-${AIRFLOW_VERSION}/constraints-${PYTHON_VERSION}.txt"

# Switch to the root user to install system-level dependencies
USER root

# Install system dependencies, including OpenJDK for JDBC
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl gnupg telnet openjdk-17-jdk && \
    curl https://packages.microsoft.com/keys/microsoft.asc | gpg --dearmor > /etc/apt/trusted.gpg.d/microsoft.gpg && \
    curl https://packages.microsoft.com/config/debian/12/prod.list > /etc/apt/sources.list.d/mssql-release.list && \
    apt-get update && \
    ACCEPT_EULA=Y apt-get install -y --no-install-recommends msodbcsql18 mssql-tools18 unixodbc-dev && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

# Set JAVA_HOME environment variable
ENV JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64

# Set the container's timezone to UTC
ENV TZ=UTC

ENV PATH="$PATH:/opt/mssql-tools18/bin"

# Allow legacy TLS versions in Java's security configuration
RUN sed -i 's/TLSv1, TLSv1.1, //g' /usr/lib/jvm/java-17-openjdk-amd64/conf/security/java.security

# Create directory for JDBC drivers and download the driver
RUN mkdir -p /opt/airflow/drivers && \
    curl -o /opt/airflow/drivers/mssql-jdbc.jar https://repo1.maven.org/maven2/com/microsoft/sqlserver/mssql-jdbc/12.4.2.jre11/mssql-jdbc-12.4.2.jre11.jar && \
    curl -o /opt/airflow/drivers/ojdbc8-12.2.0.1.jar https://repo1.maven.org/maven2/com/oracle/database/jdbc/ojdbc8/12.2.0.1/ojdbc8-12.2.0.1.jar

COPY --chown=airflow:airflow openedge.jar /opt/airflow/drivers/openedge.jar

# Switch back to the non-privileged airflow user
USER airflow

# Install Python packages
COPY requirements.txt .
RUN pip install -r requirements.txt \
    --constraint "${CONSTRAINT_URL}"
```

### What happened

I am not able to retrieve a Google Cloud connection anymore.
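A stripped-down task along the following lines should be enough to hit the error. This is only an illustrative sketch: the DAG and task names are made up, and the connection id is the one visible in the `airflow connections list` output further below.

```python
# Minimal, illustrative reproduction sketch (hypothetical DAG/task names).
from airflow.sdk import dag, task

from airflow.providers.google.common.hooks.base_google import GoogleBaseHook


@dag(schedule=None, catchup=False)
def gcp_conn_repro():
    @task
    def open_gcp_hook():
        # Fails while the hook is being constructed: the connection lookup done
        # in GoogleBaseHook.__init__ raises "The conn_id `...` isn't defined".
        GoogleBaseHook(gcp_conn_id="gcs-bucket-project")

        # Going through BaseHook directly fails the same way for me:
        # from airflow.hooks.base import BaseHook
        # BaseHook.get_connection("gcs-bucket-project")

    open_gcp_hook()


gcp_conn_repro()
```

In my actual deployment the hook is created inside a custom operator, as shown by the calls below.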
```python
from airflow.providers.google.common.hooks.base_google import GoogleBaseHook
```

```python
hook = GoogleBaseHook(gcp_conn_id=self.connection_id)
```

This produces the following error:

```
File "/home/airflow/.local/lib/python3.12/site-packages/airflow/sdk/execution_time/task_runner.py", line 1004, in run
File "/home/airflow/.local/lib/python3.12/site-packages/airflow/sdk/execution_time/task_runner.py", line 1405, in _execute_task
File "/home/airflow/.local/lib/python3.12/site-packages/airflow/sdk/bases/operator.py", line 417, in wrapper
File "/opt/airflow/plugins/operators/extraction.py", line 100, in execute
File "/opt/airflow/plugins/operators/extraction.py", line 87, in execute
File "/opt/airflow/plugins/connectors/cloud/target/gcs_target_connector.py", line 35, in __enter__
File "/home/airflow/.local/lib/python3.12/site-packages/airflow/providers/google/common/hooks/base_google.py", line 283, in __init__
File "/home/airflow/.local/lib/python3.12/site-packages/airflow/sdk/bases/hook.py", line 61, in get_connection
File "/home/airflow/.local/lib/python3.12/site-packages/airflow/sdk/definitions/connection.py", line 225, in get
File "/home/airflow/.local/lib/python3.12/site-packages/airflow/sdk/execution_time/context.py", line 174, in _get_connection
```

The connection can successfully be retrieved from the CLI using the following command:

```
airflow connections list

10 | gcs-bucket-project | google_cloud_platform | None | None | None | None | None | None | False | False | {'project': | google-cloud-platform
```

### What you think should happen instead

I should be able to retrieve the connection as I could previously in Airflow 2.11.0. I haven't seen any documentation that would prove otherwise, but I may simply not have stumbled upon it yet.

### How to reproduce

Try to instantiate a hook with a Google Cloud Platform connection that uses Application Default Credentials (ADC), as in the sketch above.

### Anything else

This may or may not be an issue, and it may be related to changes in Airflow 3, but my JDBC connections do not suffer from the same problem when used through airflow.hooks.base.BaseHook. I have also tried using BaseHook to retrieve the Google Cloud connection information, without success.

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [x] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)

GitHub link: https://github.com/apache/airflow/discussions/61094
