argemiront opened a new issue #15495:
URL: https://github.com/apache/airflow/issues/15495


   
   **Apache Airflow version**:
   2.0.2
   
   **Kubernetes version (if you are using kubernetes)** (use `kubectl version`):
   
   **Environment**:
   
   - **Cloud provider or hardware configuration**:
   - **OS** (e.g. from /etc/os-release): official airflow images 
_apache/airflow:2.0.2-python3.8_  and apache/airflow:2.0.2-python3.6
   - **Kernel** (e.g. `uname -a`):
   - **Install tools**: docker
   - **Others**:
   
   **What happened**:
   Using the official 2.0.2 airflow image the 3rd-party packages are not being 
recognized by the DAG parser, firing the  `ModuleNotFoundError` on DAGs and 
plugins. 
   
   As a test, I created a new docker image with one 3rd-party package 
(`surveymonty`). A simple DAG with no tasks imports it. The following error is 
observed:
   
   ```
   Broken DAG: [/usr/src/airflow/dags/simpledag.py] Traceback (most recent call 
last):
     File "<frozen importlib._bootstrap>", line 219, in 
_call_with_frames_removed
     File "/usr/src/airflow/dags/simpledag.py", line 2, in <module>
       import surveymonty
   ModuleNotFoundError: No module named 'surveymonty'
   ```
   I could confirm the package is installed by running `pip freeze` and `python 
simpledag.py` with no errors on both cases.
   
   
   **What you expected to happen**:
   3rd-party packages to be recognized by the parser.
   
   Using a python image and manually installing airflow 2.0.2 and the 
requirements, the issue is not present.
   
   
   **How to reproduce it**:
   Create the following Dockerfile:
   ```
   FROM apache/airflow:2.0.2-python3.8
   
   USER root
   RUN pip3 install surveymonty==0.2.5
   WORKDIR /usr/src/airflow
   ENV AIRFLOW_HOME /usr/src/airflow
   ENV AIRFLOW_GPL_UNIDECODE true
   RUN chown airflow:airflow .
   USER airflow
   COPY ./simpledag.py dags/simpledag.py
   # Airflow webserver
   EXPOSE 8080
   
   ENTRYPOINT []
   ```
   
   simpledag.py:
   ```
   from airflow import DAG
   import surveymonty
   
   dag = DAG('test', schedule_interval="* * * * *")
   ```
   docker-compose.yml:
   ```
   version: '3.8'
   services:
       db:
           image: postgres:9.6
           environment:
               - POSTGRES_USER=airflow
               - POSTGRES_PASSWORD=airflow
               - POSTGRES_DB=airflow
           ports:
               - "5432:5432"
   
       webserver:
           build: .
           restart: always
           depends_on:
               - db
           environment: 
               - AIRFLOW__CORE__LOAD_EXAMPLES=False
               - AIRFLOW__CORE__DAGS_FOLDER=/usr/src/airflow/dags
               - AIRFLOW__CORE__PLUGINS_FOLDER=/usr/src/airflow/plugins
               - 
AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql://airflow:airflow@db:5432/airflow
           ports:
               - "8080:8080"
           command: bash -c "airflow initdb && airflow webserver"
   
       scheduler:
           build: .
           restart: always
           depends_on:
               - db
           environment: 
               - AIRFLOW__CORE__LOAD_EXAMPLES=False
               - AIRFLOW__CORE__DAGS_FOLDER=/usr/src/airflow/dags
               - AIRFLOW__CORE__PLUGINS_FOLDER=/usr/src/airflow/plugins
               - 
AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql://airflow:airflow@db:5432/airflow
           command: airflow scheduler
   ```
   
   I'm my _production_ case, all 3rd-party libraries imported on plugins and 
DAGs fire the `ModuleNotFoundError` and the latest Google provider has multiple 
failures as well.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to