pierrejeambrun commented on code in PR #58814:
URL: https://github.com/apache/airflow/pull/58814#discussion_r2580606231


##########
airflow-core/src/airflow/config_templates/config.yml:
##########
@@ -712,6 +712,20 @@ database:
       type: integer
       example: ~
       default: "10000"
+    metadata_indexes:
+      description: |
+        JSON list of additional indexes to create on the metadata database at API server startup.
+
+        Each item must be a string specifying the table and one or more columns:
+        - "table(column1, column2, ...)"
+
+        Existing indexes are detected and skipped. On PostgreSQL, indexes are created
+        CONCURRENTLY to avoid locking tables. Other databases attempt non-blocking creation
+        where supported, otherwise fallback to standard index creation.
+      version_added: 3.2.0
+      type: string
+      example: "task_instance(dag_id, task_id, run_id)|log(dttm)|dag_run(dag_id, run_id)"

Review Comment:
   > Maybe just expanding the doc with common use cases of why people would 
want to add specific indexes to the db and how to identify those bottlenecks is 
enough as a starter
   
   Maybe we could add a mechanism (middleware) that logs a warning every time a 
`list entities endpoint` takes longer than, for instance, 2 seconds. It could 
extract the `order_by` from the query and explain that an index would probably 
help there.
   
   Actually, that could be a nice follow-up PR.
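
   A minimal sketch of what such a middleware could look like, written as plain 
ASGI with only the standard library. Everything here is an assumption for 
illustration, not code from this PR: the class name 
`SlowListEndpointMiddleware`, the logger name, treating every GET as a possible 
list endpoint, and the 2-second threshold (taken from the "for instance" above).

```python
import logging
import time
from urllib.parse import parse_qs

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("slow_list_endpoints")

# Assumption: 2 seconds, mirroring the threshold floated in the comment above.
SLOW_THRESHOLD_SECONDS = 2.0


class SlowListEndpointMiddleware:
    """Plain ASGI middleware (hypothetical) that warns about slow GET requests."""

    def __init__(self, app, threshold=SLOW_THRESHOLD_SECONDS, clock=time.perf_counter):
        self.app = app
        self.threshold = threshold
        # The clock is injectable so the timing logic can be tested without sleeping.
        self.clock = clock

    async def __call__(self, scope, receive, send):
        # Only time HTTP GET requests; list endpoints are assumed to be GETs.
        if scope["type"] != "http" or scope.get("method") != "GET":
            await self.app(scope, receive, send)
            return

        start = self.clock()
        try:
            await self.app(scope, receive, send)
        finally:
            elapsed = self.clock() - start
            if elapsed > self.threshold:
                # ASGI delivers the raw query string as bytes.
                query = parse_qs(scope.get("query_string", b"").decode("latin-1"))
                order_by = ",".join(query.get("order_by", [])) or "<none>"
                logger.warning(
                    "%s took %.2fs (order_by=%s); an index on the sorted "
                    "column(s) may help",
                    scope.get("path"),
                    elapsed,
                    order_by,
                )
```

   With a FastAPI or Starlette application this could presumably be registered 
via `app.add_middleware(SlowListEndpointMiddleware)`. Note that the measured time 
covers the whole response, including streaming, so it is only a rough signal that 
the underlying query is slow.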



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
