What version of Airflow are you running? While it may not be directly related to your problem, some big performance improvements were merged into Airflow 1.10.7 (I'm not sure how this plays out with the LocalExecutor, but spawning new processes to run tasks could be what is slowing down the scheduling): https://github.com/apache/airflow/pull/6627
The investigation is described in detail by Ash here: https://www.astronomer.io/blog/profiling-the-airflow-scheduler/

Damian

From: Reed Villanueva <[email protected]>
Sent: Tuesday, January 28, 2020 22:23
To: [email protected]
Subject: Re: Airflow becomes very slow after some time for large DAG

Possibly related: I modified the DAG definition file while the DAG was running. This appeared to cause no problems and modified the DAG as desired (I removed a single task from the DAG). But I thought it might be useful to mention, since I don't know what kind of interactions this could have caused under the hood. Even after the idle COMMIT queries went away, though, the DAG still appears slow.

On Tue, Jan 28, 2020 at 5:16 PM Reed Villanueva <[email protected]> wrote:

I have a large DAG (100-200 tasks) [image.png] that seems to run very slowly after some time. E.g. it starts fast, but after running for a while it gets slower (tasks that are very simple seem to take longer, and the scheduler seems to slow down).

I have no experience with PostgreSQL (which is what I am using for the backend, in LocalExecutor mode). But I suspect the scheduler is involved somehow (since the bash/python scripted version of the process that the Airflow DAG was migrated from runs much faster), and I was able to get this info:

airflow=> select state, count(*) from pg_stat_activity where datname = 'airflow' group by 1;
 state  | count
--------+-------
 active |     1
 idle   |    10
(2 rows)

and could see some stats like...
airflow=> SELECT * FROM pg_stat_activity;
 datid | datname |  pid   | usesysid | usename | application_name | client_addr | client_hostname | client_port |         backend_start         |          xact_start           |          query_start          |         state_change          | waiting | state  |              query
-------+---------+--------+----------+---------+------------------+-------------+-----------------+-------------+-------------------------------+-------------------------------+-------------------------------+-------------------------------+---------+--------+---------------------------------
 16384 | airflow |  18014 |    16385 | airflow |                  | 127.0.0.1   |                 |       58888 | 2020-01-28 16:39:24.696152-10 |                               | 2020-01-28 16:45:19.086447-10 | 2020-01-28 16:45:19.08834-10  | f       | idle   | COMMIT
. . . .

Does anyone with more experience have ideas about what could commonly be happening here? Any further debugging advice (note that I have very limited experience with psql at the moment)? Is there anything I could check or query that would give more useful info?
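On the "anything I could check / query" question: for a session whose state is not "active", the query column in pg_stat_activity only shows the last statement that session ran, so an "idle" row showing COMMIT with an empty xact_start is a pooled connection that has finished its work, not a stuck query. The sessions more worth looking for are the ones with a transaction that has been open for a long time. Below is a minimal Python sketch of that check. The function names, the 5-minute threshold, and the use of psycopg2 are assumptions for illustration and are not taken from the thread; the column names (pid, state, xact_start, query) are the ones visible in the output above. It will not explain the slowdown by itself, it only helps rule out long-held transactions.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def summarize_sessions(rows, now, max_open_txn=timedelta(minutes=5)):
    """Summarize pg_stat_activity rows given as dicts (hypothetical helper).

    Returns (counts_by_state, suspects), where suspects lists sessions whose
    transaction has been open for longer than max_open_txn.
    """
    counts = Counter(row["state"] for row in rows)
    suspects = []
    for row in rows:
        # A plain "idle" session has no open transaction: xact_start is NULL
        # and the query column only shows the last statement it ran.
        if row["xact_start"] is None:
            continue
        age = now - row["xact_start"]
        if age > max_open_txn:
            suspects.append((row["pid"], row["state"], age))
    return dict(counts), suspects


def fetch_sessions(dsn):
    """Read the relevant columns for the current database.

    Assumes psycopg2 is installed; dsn is a libpq connection string.
    """
    import psycopg2
    import psycopg2.extras

    conn = psycopg2.connect(dsn)
    try:
        with conn.cursor(cursor_factory=psycopg2.extras.RealDictCursor) as cur:
            cur.execute(
                "SELECT pid, state, xact_start, query "
                "FROM pg_stat_activity WHERE datname = current_database()"
            )
            return cur.fetchall()
    finally:
        conn.close()


# The single row shown in the thread: idle, last statement COMMIT,
# and no open transaction (xact_start is empty).
tz = timezone(timedelta(hours=-10))
thread_row = {"pid": 18014, "state": "idle", "xact_start": None, "query": "COMMIT"}
print(summarize_sessions([thread_row], datetime(2020, 1, 28, 17, 0, tzinfo=tz)))
```

Against a live database this could be run as summarize_sessions(fetch_sessions(dsn), datetime.now(timezone.utc)), with dsn pointing at the Airflow metadata database. If the suspects list stays empty while the DAG is slow, the idle connections are probably not the cause and the scheduler itself is the more likely place to look.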
