gmsantos opened a new pull request, #23625: URL: https://github.com/apache/airflow/pull/23625
Airflow 2.3.0 was introduced by default the [warn deployment exposure feature](https://github.com/apache/airflow/pull/18557) in the webserver by inspecting the log table for robots events. This query could be costly in deployments with large log tables. Adding an index in the event column improves this query performance. Here is an example of the generated query in a log table with around ~33 million rows: ```sql SELECT count(*) AS count_1 FROM (SELECT log.id AS log_id, log.dttm AS log_dttm, log.dag_id AS log_dag_id, log.task_id AS log_task_id, log.map_index AS log_map_index, log.event AS log_event, log.execution_date AS log_execution_date, log.owner AS log_owner, log.extra AS log_extra FROM log WHERE log.event = 'robots' AND log.dttm > '2022-05-03T15:11:03.381511+00:00'::timestamptz) AS anon_1; ``` Before applying the index: **3.386s** <img width="818" alt="image" src="https://user-images.githubusercontent.com/1991286/167680317-3d7f6f3b-3e77-4b40-945f-f4ecae2d9798.png"> After applying the index: **0.109s** <img width="816" alt="image" src="https://user-images.githubusercontent.com/1991286/167680386-92457c5e-e007-4703-8e0f-9df1a1dc56bb.png"> Also, the execution plan in a PostgreSQL database: **Before** <img width="684" alt="image" src="https://user-images.githubusercontent.com/1991286/167680799-fe9fc1bb-b712-4b7b-b382-0916a0420f75.png"> **After** <img width="569" alt="image" src="https://user-images.githubusercontent.com/1991286/167680862-6ce485db-d8b5-4c94-9a8f-81df3e7e9b3e.png"> -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
