SameerMesiah97 commented on code in PR #62600:
URL: https://github.com/apache/airflow/pull/62600#discussion_r2937384408


##########
airflow-core/src/airflow/dag_processing/manager.py:
##########
@@ -971,6 +972,9 @@ def _create_process(self, dag_file: DagFileInfo) -> DagFileProcessorProcess:
         callback_to_execute_for_file = self._callback_to_execute.pop(dag_file, [])
         logger, logger_filehandle = self._get_logger_for_dag_file(dag_file)
 
+        # TODO: This is a blocking DB call, we should probably cache this or move it elsewhere
+        known_pools = {p.pool for p in Pool.get_pools()}

Review Comment:
   I understand that you mentioned the intention to eventually cache this, but I don't think it is acceptable to introduce a DB call here just to support `known_pools`. This should not be waved away behind a `# TODO`.
   
   It looks like this can be avoided by initializing `known_pools` alongside the other `DagFileProcessorManager` attributes and refreshing it periodically inside `_run_parsing_loop`. As it stands, this implementation issues an unnecessary metadata DB query every time `_create_process()` is called.
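   A minimal sketch of the refresh pattern suggested above. This is not the actual Airflow code: `_fetch_pool_names` is a hypothetical stand-in for the `Pool.get_pools()` metadata-DB call, and the real manager would hold the cache as an attribute and refresh it from `_run_parsing_loop` rather than from `_create_process()`:

   ```python
   import time


   def _fetch_pool_names() -> set[str]:
       # Hypothetical stand-in for {p.pool for p in Pool.get_pools()},
       # which performs a blocking metadata-DB round trip.
       return {"default_pool"}


   class PoolNameCache:
       """Cache pool names, refreshing at most once per `refresh_interval` seconds."""

       def __init__(self, refresh_interval: float = 30.0) -> None:
           self.refresh_interval = refresh_interval
           self._known_pools: set[str] = set()
           self._last_refresh: float = 0.0  # forces a refresh on first access

       def known_pools(self) -> set[str]:
           now = time.monotonic()
           if now - self._last_refresh >= self.refresh_interval:
               # Single DB query per interval instead of one per process spawn.
               self._known_pools = _fetch_pool_names()
               self._last_refresh = now
           return self._known_pools
   ```

   With this shape, `_create_process()` only reads the cached set, and the periodic loop bounds the query rate regardless of how many DAG-file processors are spawned.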



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
