I think there might be two ways:

1. Set up the connections via the Airflow UI: 
http://airflow.readthedocs.io/en/latest/configuration.html#connections. I guess 
this could also be done in your code.

2. Put your connection setup into an operator at the beginning of your DAG (see the sketch below).
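
For example, something along these lines (an untested sketch; the conn_id, credentials and DAG name are made up) would register the connection in the Airflow metadata db as the first task of the DAG:

from datetime import datetime

from airflow import DAG, settings
from airflow.models import Connection
from airflow.operators.python_operator import PythonOperator


def create_db_connection():
    # Register the connection in the Airflow metadata db if it is not there yet.
    session = settings.Session()
    if not session.query(Connection).filter(Connection.conn_id == "my_db").first():
        session.add(Connection(
            conn_id="my_db",            # made-up id, referenced by the downstream tasks
            conn_type="postgres",
            host="db.example.com",
            schema="mydb",
            login="user",
            password="secret",
            port=5432,
        ))
        session.commit()
    session.close()


dag = DAG("my_dag", start_date=datetime(2018, 5, 1), schedule_interval=None)

setup_connection = PythonOperator(
    task_id="setup_connection",
    python_callable=create_db_connection,
    dag=dag,
)

Downstream tasks can then reference the connection by its conn_id (e.g. through a hook).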

________________________________
From: alireza.khoshkbari@gmail.com <alireza.khoshkb...@gmail.com>
Sent: May 16, 2018 1:21
To: d...@airflow.apache.org
Subject: How Airflow import modules as it executes the tasks

To start off, here is my project structure:
├── dags
│   ├── __init__.py
│   ├── core
│   │   ├── __init__.py
│   │   ├── operators
│   │   │   ├── __init__.py
│   │   │   ├── first_operator.py
│   │   └── util
│   │       ├── __init__.py
│   │       ├── db.py
│   ├── my_dag.py

Here are the versions and details of the Airflow Docker setup:

In my DAG, different tasks connect to a db (not the Airflow db). I've set up db 
connection pooling, so I expected that my db.py would be loaded once across 
the DagRun. However, in the log I can see that each task imports the module and 
new db connections are made by each and every task. I can tell that db.py is loaded 
in each task because of the line below in db.py:

logging.info("I was loaded {}".format(random.randint(0,100)))
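
For context, db.py does something along these lines (a simplified sketch; I'm showing SQLAlchemy-style pooling as an illustration, with placeholder connection details):

import logging
import random

from sqlalchemy import create_engine

logging.info("I was loaded {}".format(random.randint(0, 100)))

# Module-level engine with a connection pool; it gets recreated whenever the
# module is imported in a fresh process.
engine = create_engine(
    "postgresql://user:secret@db.example.com/mydb",  # placeholder URL
    pool_size=5,
    max_overflow=10,
)


def get_connection():
    # Check a connection out of the pool.
    return engine.connect()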

I understand that each operator can technically run on a separate machine, and it 
makes sense that each task runs more or less independently. However, I'm not sure 
whether this applies when using the LocalExecutor. So the question is: how can I 
share resources (db connections) across tasks when using the LocalExecutor?
