Re: Disable Processing of DAG file

2018-05-30 Thread ramandumcs
Thanks Maxime. We have hundreds of DAGs with schedule set to @once, and new DAGs keep arriving in the system. The scheduler processes each and every DAG inside the local DAG folder. Processing each DAG file takes around 400 milliseconds, and we have set max_threads to 8 (as we have an 8-core machine). i.e

Re: Disable Processing of DAG file

2018-05-30 Thread Maxime Beauchemin
The TLDR of how the processor works is:

while True:
* sets up a multiprocessing queue with N processes (say 32)
* main process looks for the list of all .py files in DAGS_FOLDER
* fills in the queue with all .py files
* each one of the 32 subprocesses opens a file and interprets it (it's insulated from the
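
The loop described above can be approximated with a short Python sketch. This is not Airflow's actual implementation: the function names `list_py_files`, `parse_file`, and `process_folder_once` are hypothetical, and `ast.parse` is only a stand-in for the real step where a subprocess interprets a file and builds DAG objects. It shows one pass of the loop: list the .py files, then hand them to a pool of worker processes.

```python
import ast
import multiprocessing
from pathlib import Path


def list_py_files(dags_folder):
    """Return every .py file under the folder, sorted for stable ordering."""
    return sorted(str(p) for p in Path(dags_folder).rglob("*.py"))


def parse_file(path):
    """Parse one file; return (path, ok, detail).

    Assumption: ast.parse only checks syntax. It stands in for the real
    step, where a subprocess interprets the file and collects DAG objects.
    """
    try:
        ast.parse(Path(path).read_text(), filename=path)
        return path, True, ""
    except (SyntaxError, OSError, UnicodeDecodeError) as exc:
        return path, False, str(exc)


def process_folder_once(dags_folder, num_processes=8):
    """One pass of the loop: list the files, then parse them in a worker pool.

    A file that fails to parse is reported in the results and does not
    stop the other files from being processed.
    """
    py_files = list_py_files(dags_folder)
    with multiprocessing.Pool(processes=num_processes) as pool:
        return pool.map(parse_file, py_files)
```

To try it, call `process_folder_once("/path/to/dags")` from under an `if __name__ == "__main__":` guard, which `multiprocessing` needs on platforms that start workers by spawning a fresh interpreter. Because every file is visited on every pass, the cost of one pass grows with the number of files in the folder, whatever schedule the DAGs inside them use.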

Re: Disable Processing of DAG file

2018-05-29 Thread Ruiqin Yang
Hi folks, this config line controls how often the scheduler scans the DAG folder and tries to discover/forget DAGs. For the DAG file processing part, the scheduler does parse the DAG file

Re: Disable Processing of DAG file

2018-05-28 Thread Ananth Durai
It is an interesting question. On a slightly related note, correct me if I'm wrong, but AFAIK we need to restart the Airflow scheduler for it to pick up any new DAG file changes. In that case, should the scheduler do the DAGFileProcessing every time before scheduling the tasks? Regards,

Disable Processing of DAG file

2018-05-28 Thread ramandumcs
Hi all, we have a use case where there would be hundreds of DAG files with schedule set to "@once". Currently it seems that the scheduler processes each and every file and creates a DAG object. Is there a way or config to tell the scheduler to stop processing certain files? Thanks, Raman Gupta