Cache them where? When would the cache get invalidated? Given that DAG parsing happens in a sub-process, how would the cache live longer than that process?
I think the change might be to use a per-process/per-thread SQLA connection when parsing DAGs, so that if a DAG needs access to the metadata DB it does it with just one connection rather than N.

-ash

> On 22 Oct 2018, at 11:11, Sai Phanindhra <phani8...@gmail.com> wrote:
>
> Why don't we cache variables? We can fairly assume that variables won't
> change very frequently (not as frequently as the scheduler's DAG runs).
> We can keep the default timeout at a few multiples of the scheduler
> interval. This will help control the number of connections to the
> database and reduce load on both the scheduler and the database.
>
> On Mon 22 Oct, 2018, 13:34 Marcin Szymański, <ms32...@gmail.com> wrote:
>
>> Hi
>>
>> You are right, it's a sure way to saturate DB connections, as a
>> connection is established every few seconds when the DAGs are parsed.
>> The same happens when you use variables in the __init__ of an operator.
>> An OS environment variable would be safer for your need.
>>
>> Marcin
>>
>>
>> On Mon, 22 Oct 2018, 08:34 Pramiti Goel, <pramitigoe...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> We want to make the owner and email ID general, so we don't want to
>>> hard-code them in the Airflow DAG. Using Variables will help us change
>>> the email/owner later if there are a lot of DAGs with the same owner.
>>>
>>> For example:
>>>
>>> default_args = {
>>>     'owner': Variable.get('test_owner_de'),
>>>     'depends_on_past': False,
>>>     'start_date': datetime(2018, 10, 17),
>>>     'email': Variable.get('de_infra_email'),
>>>     'email_on_failure': True,
>>>     'email_on_retry': True,
>>>     'retries': 2,
>>>     'retry_delay': timedelta(minutes=1)}
>>>
>>> Looking at the Airflow code, it opens a session every time
>>> a variable is read, and then closes it. (Let me know if I have
>>> misunderstood.) If there are many DAGs with Variables in default_args
>>> running in parallel, all querying the variable table in MySQL, will
>>> that hit any limit on the number of SQLAlchemy sessions?
>>> Will that make DAG parsing slow, since there will be many queries to
>>> MySQL for each DAG? Is the above approach good?
>>>
>>>> Using Airflow 1.9
>>>
>>> Thanks,
>>> Pramiti.
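
[Editor's sketch of Marcin's suggestion above: reading the owner and alert email from OS environment variables instead of Variable.get(), so that parsing default_args never opens a database session. The variable names AIRFLOW_DAG_OWNER and AIRFLOW_ALERT_EMAIL and their fallback values are illustrative assumptions, not Airflow conventions.]

```python
import os
from datetime import datetime, timedelta

# Hypothetical env var names; set them once in the scheduler/worker
# environment, e.g. via systemd unit, Docker env, or supervisor config.
default_args = {
    # os.environ.get reads from the process environment: no DB round trip
    # during DAG parsing, unlike Variable.get().
    "owner": os.environ.get("AIRFLOW_DAG_OWNER", "data-engineering"),
    "depends_on_past": False,
    "start_date": datetime(2018, 10, 17),
    "email": os.environ.get("AIRFLOW_ALERT_EMAIL", "de-alerts@example.com"),
    "email_on_failure": True,
    "email_on_retry": True,
    "retries": 2,
    "retry_delay": timedelta(minutes=1),
}
```

Changing the owner or email later means updating one environment variable and restarting the scheduler, rather than editing every DAG file; the trade-off versus Variables is that there is no UI for editing them at runtime.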