potiuk commented on code in PR #25935:
URL: https://github.com/apache/airflow/pull/25935#discussion_r961628298


##########
airflow/dag_processing/manager.py:
##########
@@ -663,6 +675,12 @@ def _fetch_callbacks(self, max_callbacks: int, session: 
Session = NEW_SESSION):
         with prohibit_commit(session) as guard:
             query = (
                 session.query(DbCallbackRequest)
+                .filter(
+                    or_(
+                        DbCallbackRequest.dag_directory == 
DagProcessorDirectory.get_dag_directory(),
+                        not self.standalone_dag_processor,

Review Comment:
   I read a little about it and it seems that - unlike in many programing 
languages, there is no predetermined ways how optimizer processes boolean 
expressions.
   
   If we could rely on left-to-right order, this would be the right solution:
   
   
   ```
                       or_(
                           not self.standalone_dag_processor,
                           DbCallbackRequest.dag_directory == 
DagProcessorDirectory.get_dag_directory(),
   ```
   
   But it seems we can't.
   
   So I think the cleanest (and most performant) solution will be something 
different:
   
   ```
            query =   session.query(DagModel.dag_id, DagModel.fileloc, 
DagModel.last_parsed_time).filter(DagModel.is_active())
            if self.standalone_dag_processor:
                 query = query.filter(DagModel.processor_subdir == 
self.get_dag_directory())
            dags_parsed = query.all()
   ```
   
   Seems like most pythonic way to solve it. AFAIK, It's perfectly fine to 
chain filters in SQLAlchemy and the result is the same as multiple filter 
entries.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to