Steexyz opened a new issue, #60535:
URL: https://github.com/apache/airflow/issues/60535

   ### Apache Airflow Provider(s)
   
   google
   
   ### Versions of Apache Airflow Providers
   
   19.3.0
   
   ### Apache Airflow version
   
   3.1.6
   
   ### Operating System
   
   Debian GNU/Linux 12 (bookworm)
   
   ### Deployment
   
   Docker-Compose
   
   ### Deployment details
   
   Deployment was made using the base docker-compose file.
   ```
   # Licensed to the Apache Software Foundation (ASF) under one
   # or more contributor license agreements.  See the NOTICE file
   # distributed with this work for additional information
   # regarding copyright ownership.  The ASF licenses this file
   # to you under the Apache License, Version 2.0 (the
   # "License"); you may not use this file except in compliance
   # with the License.  You may obtain a copy of the License at
   #
   #   http://www.apache.org/licenses/LICENSE-2.0
   #
   # Unless required by applicable law or agreed to in writing,
   # software distributed under the License is distributed on an
   # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
   # KIND, either express or implied.  See the License for the
   # specific language governing permissions and limitations
   # under the License.
   #
   
   # Basic Airflow cluster configuration for CeleryExecutor with Redis and 
PostgreSQL.
   #
   # WARNING: This configuration is for local development. Do not use it in a 
production deployment.
   #
   # This configuration supports basic configuration using environment 
variables or an .env file
   # The following variables are supported:
   #
   # AIRFLOW_IMAGE_NAME           - Docker image name used to run Airflow.
   #                                Default: apache/airflow:3.1.6
   # AIRFLOW_UID                  - User ID in Airflow containers
   #                                Default: 50000
   # AIRFLOW_PROJ_DIR             - Base path to which all the files will be 
volumed.
   #                                Default: .
   # Those configurations are useful mostly in case of standalone 
testing/running Airflow in test/try-out mode
   #
   # _AIRFLOW_WWW_USER_USERNAME   - Username for the administrator account (if 
requested).
   #                                Default: airflow
   # _AIRFLOW_WWW_USER_PASSWORD   - Password for the administrator account (if 
requested).
   #                                Default: airflow
   # _PIP_ADDITIONAL_REQUIREMENTS - Additional PIP requirements to add when 
starting all containers.
   #                                Use this option ONLY for quick checks. 
Installing requirements at container
   #                                startup is done EVERY TIME the service is 
started.
   #                                A better way is to build a custom image or 
extend the official image
   #                                as described in 
https://airflow.apache.org/docs/docker-stack/build.html.
   #                                Default: ''
   #
   # Feel free to modify this file to suit your needs.
   ---
   x-airflow-common:
     &airflow-common
     # In order to add custom dependencies or upgrade provider distributions 
you can use your extended image.
     # Comment the image line, place your Dockerfile in the directory where you 
placed the docker-compose.yaml
     # and uncomment the "build" line below, Then run `docker-compose build` to 
build the images.
     image: custom-airflow-3:3.1.6
     # build: .
     environment:
       &airflow-common-env
       AIRFLOW__CORE__EXECUTOR: CeleryExecutor
       AIRFLOW__CORE__AUTH_MANAGER: airflow.providers.fab.auth_manager.fab_auth_manager.FabAuthManager
       AIRFLOW__DATABASE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow
       AIRFLOW__CORE__FERNET_KEY: '8_4t1Ld5YV7sS3k9n-Bv2pXgZc1R0fHwJmNlUa6I7oQ='
       AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION: 'true'
       AIRFLOW__CELERY__RESULT_BACKEND: db+postgresql://airflow:airflow@postgres/airflow
       AIRFLOW__CELERY__BROKER_URL: redis://:@redis:6379/0
       AIRFLOW__CORE__LOAD_EXAMPLES: 'false'
       AIRFLOW__CORE__EXECUTION_API_SERVER_URL: 'http://airflow-apiserver:8080/execution/'
       # yamllint disable rule:line-length
       # Use simple http server on scheduler for health checks
       # See 
https://airflow.apache.org/docs/apache-airflow/stable/administration-and-deployment/logging-monitoring/check-health.html#scheduler-health-check-server
       # yamllint enable rule:line-length
       AIRFLOW__SCHEDULER__ENABLE_HEALTH_CHECK: 'true'
       # WARNING: Use _PIP_ADDITIONAL_REQUIREMENTS option ONLY for a quick 
checks
       # The following line can be used to set a custom config file, stored in 
the local config folder
       AIRFLOW_CONFIG: '/opt/airflow/config/airflow.cfg'
     volumes:
       - ./dags:/opt/airflow/dags
       - ./logs:/opt/airflow/logs
       - ./plugins:/opt/airflow/plugins
       - ./config:/opt/airflow/config
       - ${appdata}/gcloud:/home/airflow/.config/gcloud
     user: "${AIRFLOW_UID:-50000}:0"
     depends_on:
       &airflow-common-depends-on
       redis:
         condition: service_healthy
       postgres:
         condition: service_healthy
   
   services:
     postgres:
       image: postgres:16
       environment:
         POSTGRES_USER: airflow
         POSTGRES_PASSWORD: airflow
         POSTGRES_DB: airflow
       volumes:
         - postgres-db-volume:/var/lib/postgresql/data
       healthcheck:
         test: ["CMD", "pg_isready", "-U", "airflow"]
         interval: 10s
         retries: 5
         start_period: 5s
       restart: always
     redis:
       # Redis is limited to 7.2-bookworm due to licencing change
       # https://redis.io/blog/redis-adopts-dual-source-available-licensing/
       image: redis:7.2-bookworm
       expose:
         - 6379
       healthcheck:
         test: ["CMD", "redis-cli", "ping"]
         interval: 10s
         timeout: 30s
         retries: 50
         start_period: 30s
       restart: always
   
     airflow-apiserver:
       <<: *airflow-common
       command: api-server
       ports:
         - "8080:8080"
       healthcheck:
         test: ["CMD", "curl", "--fail", "http://localhost:8080/api/v2/version"]
         interval: 30s
         timeout: 10s
         retries: 5
         start_period: 30s
       restart: always
       depends_on:
         <<: *airflow-common-depends-on
         airflow-init:
           condition: service_completed_successfully
   
     airflow-scheduler:
       <<: *airflow-common
       command: scheduler
       healthcheck:
         test: ["CMD", "curl", "--fail", "http://localhost:8974/health"]
         interval: 30s
         timeout: 10s
         retries: 5
         start_period: 30s
       restart: always
       depends_on:
         <<: *airflow-common-depends-on
         airflow-init:
           condition: service_completed_successfully
   
     airflow-dag-processor:
       <<: *airflow-common
       command: dag-processor
       healthcheck:
         test: ["CMD-SHELL", 'airflow jobs check --job-type DagProcessorJob --hostname "$${HOSTNAME}"']
         interval: 30s
         timeout: 10s
         retries: 5
         start_period: 30s
       restart: always
       depends_on:
         <<: *airflow-common-depends-on
         airflow-init:
           condition: service_completed_successfully
   
     airflow-triggerer:
       <<: *airflow-common
       command: triggerer
       healthcheck:
         test: ["CMD-SHELL", 'airflow jobs check --job-type TriggererJob --hostname "$${HOSTNAME}"']
         interval: 30s
         timeout: 10s
         retries: 5
         start_period: 30s
       restart: always
       depends_on:
         <<: *airflow-common-depends-on
         airflow-init:
           condition: service_completed_successfully
           
     airflow-worker:
       <<: *airflow-common
       command: celery worker
       healthcheck:
         # yamllint disable rule:line-length
         test:
           - "CMD-SHELL"
           - 'celery --app airflow.providers.celery.executors.celery_executor.app inspect ping -d "celery@$${HOSTNAME}" || celery --app airflow.executors.celery_executor.app inspect ping -d "celery@$${HOSTNAME}"'
         interval: 30s
         timeout: 10s
         retries: 5
         start_period: 30s
       environment:
         <<: *airflow-common-env
         # Required to handle warm shutdown of the celery workers properly
         # See 
https://airflow.apache.org/docs/docker-stack/entrypoint.html#signal-propagation
         DUMB_INIT_SETSID: "0"
       restart: always
       depends_on:
         <<: *airflow-common-depends-on
         airflow-apiserver:
           condition: service_healthy
         airflow-init:
           condition: service_completed_successfully
   
     airflow-init:
       <<: *airflow-common
       entrypoint: /bin/bash
       # yamllint disable rule:line-length
       command:
         - -c
         - |
           if [[ -z "50000" ]]; then
             echo
             echo -e "\033[1;33mWARNING!!!: AIRFLOW_UID not set!\e[0m"
             echo "If you are on Linux, you SHOULD follow the instructions 
below to set "
             echo "AIRFLOW_UID environment variable, otherwise files will be 
owned by root."
             echo "For other operating systems you can get rid of the warning 
with manually created .env file:"
             echo "    See: https://airflow.apache.org/docs/apache-airflow/stable/howto/docker-compose/index.html#setting-the-right-airflow-user"
             echo
             export AIRFLOW_UID=$$(id -u)
           fi
           one_meg=1048576
           mem_available=$$(($$(getconf _PHYS_PAGES) * $$(getconf PAGE_SIZE) / 
one_meg))
           cpus_available=$$(grep -cE 'cpu[0-9]+' /proc/stat)
           disk_available=$$(df / | tail -1 | awk '{print $$4}')
           warning_resources="false"
           if (( mem_available < 4000 )) ; then
             echo
             echo -e "\033[1;33mWARNING!!!: Not enough memory available for 
Docker.\e[0m"
             echo "At least 4GB of memory required. You have $$(numfmt --to iec 
$$((mem_available * one_meg)))"
             echo
             warning_resources="true"
           fi
           if (( cpus_available < 2 )); then
             echo
             echo -e "\033[1;33mWARNING!!!: Not enough CPUS available for 
Docker.\e[0m"
             echo "At least 2 CPUs recommended. You have $${cpus_available}"
             echo
             warning_resources="true"
           fi
           if (( disk_available < one_meg * 10 )); then
             echo
             echo -e "\033[1;33mWARNING!!!: Not enough Disk space available for 
Docker.\e[0m"
             echo "At least 10 GBs recommended. You have $$(numfmt --to iec 
$$((disk_available * 1024 )))"
             echo
             warning_resources="true"
           fi
           if [[ $${warning_resources} == "true" ]]; then
             echo
             echo -e "\033[1;33mWARNING!!!: You have not enough resources to 
run Airflow (see above)!\e[0m"
             echo "Please follow the instructions to increase amount of 
resources available:"
             echo "   https://airflow.apache.org/docs/apache-airflow/stable/howto/docker-compose/index.html#before-you-begin"
             echo
           fi
           echo
           echo "Creating missing opt dirs if missing:"
           echo
           mkdir -v -p /opt/airflow/{logs,dags,plugins,config}
           echo
           echo "Airflow version:"
           /entrypoint airflow version
           echo
           echo "Files in shared volumes:"
           echo
           ls -la /opt/airflow/{logs,dags,plugins,config}
           echo
           echo "Running airflow config list to create default config file if 
missing."
           echo
           /entrypoint airflow config list >/dev/null
           echo
           echo "Files in shared volumes:"
           echo
           ls -la /opt/airflow/{logs,dags,plugins,config}
           echo
           echo "Change ownership of files in /opt/airflow to 50000:0"
           echo
           chown -R "50000:0" /opt/airflow/
           echo
           echo "Change ownership of files in shared volumes to 50000:0"
           echo
           chown -v -R "50000:0" /opt/airflow/{logs,dags,plugins,config}
           echo
           echo "Files in shared volumes:"
           echo
           ls -la /opt/airflow/{logs,dags,plugins,config}
   
       # yamllint enable rule:line-length
       environment:
         <<: *airflow-common-env
         _AIRFLOW_DB_MIGRATE: 'true'
         _AIRFLOW_WWW_USER_CREATE: 'true'
         _AIRFLOW_WWW_USER_USERNAME: ${_AIRFLOW_WWW_USER_USERNAME:-airflow}
         _AIRFLOW_WWW_USER_PASSWORD: ${_AIRFLOW_WWW_USER_PASSWORD:-airflow}
         _PIP_ADDITIONAL_REQUIREMENTS: ''
       user: "0:0"
   
     airflow-cli:
       <<: *airflow-common
       profiles:
         - debug
       environment:
         <<: *airflow-common-env
         CONNECTION_CHECK_MAX_COUNT: "0"
       # Workaround for entrypoint issue. See: 
https://github.com/apache/airflow/issues/16252
       command:
         - bash
         - -c
         - airflow
       depends_on:
         <<: *airflow-common-depends-on
   
   volumes:
     postgres-db-volume:
   
   ``` 
   A custom image was built with the following additional requirements:
   
   ```
   apache-airflow-providers-fab
   apache-airflow-providers-celery
   connexion[swagger-ui]
   apache-airflow-providers-jdbc
   apache-airflow-providers-oracle[common.sql]
   apache-airflow-providers-snowflake
   apache-airflow-providers-docker
   airflow-provider-rabbitmq
   apache-airflow-providers-google
   google-cloud-storage
   pandas
   zeep
   xmltodict
   orjson
   pyodbc
   ``` 
   Dockerfile:
   
   ```dockerfile
   FROM apache/airflow:3.1.6-python3.12
   
   ARG AIRFLOW_VERSION=3.1.6
   ARG PYTHON_VERSION=3.12
   ARG CONSTRAINT_URL="https://raw.githubusercontent.com/apache/airflow/constraints-${AIRFLOW_VERSION}/constraints-${PYTHON_VERSION}.txt"
   
   
   # Switch to the root user to install system-level dependencies
   USER root
   
   # Install system dependencies, including OpenJDK for JDBC
   RUN apt-get update && \
       apt-get install -y --no-install-recommends curl gnupg telnet openjdk-17-jdk && \
       curl https://packages.microsoft.com/keys/microsoft.asc | gpg --dearmor > /etc/apt/trusted.gpg.d/microsoft.gpg && \
       curl https://packages.microsoft.com/config/debian/12/prod.list > /etc/apt/sources.list.d/mssql-release.list && \
       apt-get update && \
       ACCEPT_EULA=Y apt-get install -y --no-install-recommends msodbcsql18 mssql-tools18 unixodbc-dev && \
       apt-get clean && \
       rm -rf /var/lib/apt/lists/*
   
   # Set JAVA_HOME environment variable
   ENV JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64
   
   # Set the container's timezone to UTC
   ENV TZ=UTC
   ENV PATH="$PATH:/opt/mssql-tools18/bin"
   
   # Allow legacy TLS versions in Java's security configuration
   RUN sed -i 's/TLSv1, TLSv1.1, //g' /usr/lib/jvm/java-17-openjdk-amd64/conf/security/java.security
   
   # Create directory for JDBC drivers and download the driver
   RUN mkdir -p /opt/airflow/drivers && \
       curl -o /opt/airflow/drivers/mssql-jdbc.jar https://repo1.maven.org/maven2/com/microsoft/sqlserver/mssql-jdbc/12.4.2.jre11/mssql-jdbc-12.4.2.jre11.jar && \
       curl -o /opt/airflow/drivers/ojdbc8-12.2.0.1.jar https://repo1.maven.org/maven2/com/oracle/database/jdbc/ojdbc8/12.2.0.1/ojdbc8-12.2.0.1.jar
   
   COPY --chown=airflow:airflow openedge.jar /opt/airflow/drivers/openedge.jar
   # Switch back to the non-privileged airflow user
   USER airflow
   
   # Install Python packages
   COPY requirements.txt .
   RUN pip install -r requirements.txt \
       --constraint "${CONSTRAINT_URL}"
   ``` 
   
   ### What happened
   
   I am not able to retrieve a Google Cloud connection anymore.
   
   ```python
   from airflow.providers.google.common.hooks.base_google import GoogleBaseHook
   ``` 
   
   ```python
   hook = GoogleBaseHook(gcp_conn_id=self.connection_id)
   ``` 
   
   This produces the following error:
   
   ```
   File "/home/airflow/.local/lib/python3.12/site-packages/airflow/sdk/execution_time/task_runner.py", line 1004 in run
   
   File "/home/airflow/.local/lib/python3.12/site-packages/airflow/sdk/execution_time/task_runner.py", line 1405 in _execute_task
   
   File "/home/airflow/.local/lib/python3.12/site-packages/airflow/sdk/bases/operator.py", line 417 in wrapper
   
   File "/opt/airflow/plugins/operators/extraction.py", line 100 in execute
   
   File "/opt/airflow/plugins/operators/extraction.py", line 87 in execute
   
   File "/opt/airflow/plugins/connectors/cloud/target/gcs_target_connector.py", line 35 in __enter__
   
   File "/home/airflow/.local/lib/python3.12/site-packages/airflow/providers/google/common/hooks/base_google.py", line 283 in __init__
   
   File "/home/airflow/.local/lib/python3.12/site-packages/airflow/sdk/bases/hook.py", line 61 in get_connection
   
   File "/home/airflow/.local/lib/python3.12/site-packages/airflow/sdk/definitions/connection.py", line 225 in get
   
   File "/home/airflow/.local/lib/python3.12/site-packages/airflow/sdk/execution_time/context.py", line 174 in _get_connection
   ``` 
   The connection can be retrieved successfully from the CLI using the following command:
   
   ```
   airflow connections list
   
   10 | gcs-bucket-project | google_cloud_platform | None        | None         
     | None               | None              | None               | None | 
False        | False             | {'project':        | google-cloud-platform
   
   ``` 
   
   
   ### What you think should happen instead
   
   I should be able to retrieve the connection as I could in Airflow 2.11.0. I haven't seen any documentation that says otherwise, but I may simply not have come across it yet.
   
   ### How to reproduce
   
   Try to instantiate a hook with a Google Cloud Platform connection that uses Application Default Credentials (ADC).
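
A minimal sketch of the reproduction, assuming the connection ID `gcs-bucket-project` shown in the CLI output above. The helper names `describe_failure` and `make_google_hook` are hypothetical and not part of Airflow. Since the traceback above does not include the final exception line, the first helper simply reports whatever exception type and message is raised.

```python
from typing import Callable


def describe_failure(action: Callable[[], object]) -> str:
    """Run `action` and return "ok", or "ExceptionType: message" if it raises."""
    try:
        action()
    except Exception as exc:  # deliberately broad: report whatever is raised
        return f"{type(exc).__name__}: {exc}"
    return "ok"


def make_google_hook(conn_id: str = "gcs-bucket-project"):
    """Instantiate the hook the same way as in the report.

    The import is deferred so that describe_failure can be used on its own,
    without the Google provider installed.
    """
    from airflow.providers.google.common.hooks.base_google import GoogleBaseHook

    return GoogleBaseHook(gcp_conn_id=conn_id)


# Intended use, from inside a task's execute() on a worker:
#     print(describe_failure(make_google_hook))
```

Running this from inside a task rather than from a shell matters here, because the CLI lookup succeeds while the lookup during task execution is the one that fails.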
   
   ### Anything else
   
   This may or may not be a bug; it could be related to changes in Airflow 3. However, my JDBC connections do not suffer from the same issue when using airflow.hooks.base.BaseHook. I've also tried using BaseHook to retrieve the connection information, without success.
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

