potiuk commented on code in PR #28256:
URL: https://github.com/apache/airflow/pull/28256#discussion_r1057903720


##########
airflow/dag_processing/manager.py:
##########
@@ -777,8 +777,9 @@ def clear_nonexistent_import_errors(self, session):
         :param session: session for ORM operations
         """
         query = session.query(errors.ImportError)
-        if self._file_paths:
-            query = query.filter(~errors.ImportError.filename.in_(self._file_paths))
+        files = list_py_file_paths(self._dag_directory, include_examples=False, include_zip_paths=True)

Review Comment:
   Do we have an idea how to do this better? I have not looked at all the 
processing paths involved, but I am not sure how "expensive" this is. It 
seems to be done once per loop, and it is something we have to do periodically 
anyway, as we cannot really rely on an inotify-style API given the various 
filesystems the DAG folder can be on.
   
   Maybe this check should happen independently, on another thread, in parallel 
with parsing? Any other ideas, @ephraimbuddy?
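   
   To make the "another thread" idea concrete, here is a rough sketch, not a 
proposal for the actual implementation. `FileListRefresher`, `scan_once` and 
`current_files` are hypothetical names, and a plain `os.walk` stands in for 
`list_py_file_paths` so that the snippet is self-contained. The loop would read 
the cached set instead of rescanning the DAG folder on every iteration.

```python
import os
import threading


class FileListRefresher:
    """Hypothetical helper: rescans a DAG folder on a background thread."""

    def __init__(self, dag_directory, interval=30.0):
        self._dag_directory = dag_directory
        self._interval = interval
        self._lock = threading.Lock()
        self._files = frozenset()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def scan_once(self):
        # Assumption: os.walk stands in for list_py_file_paths here.
        found = set()
        for root, _dirs, names in os.walk(self._dag_directory):
            for name in names:
                if name.endswith((".py", ".zip")):
                    found.add(os.path.join(root, name))
        with self._lock:
            self._files = frozenset(found)

    def _run(self):
        # Event.wait returns True once stop() is called, which ends the loop.
        while not self._stop.wait(self._interval):
            self.scan_once()

    def start(self):
        self.scan_once()  # populate the cache before the first read
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def current_files(self):
        # Returns an immutable snapshot, safe to use outside the lock.
        with self._lock:
            return self._files
```

   The trade-off is staleness: a file deleted between two scans keeps its 
import error for up to `interval` seconds, which may or may not be acceptable 
here.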


