potiuk commented on code in PR #40894:
URL: https://github.com/apache/airflow/pull/40894#discussion_r1685004350


##########
airflow/serialization/pydantic/taskinstance.py:
##########
@@ -458,9 +458,9 @@ def schedule_downstream_tasks(self, session: Session | None = None, max_tis_per_
 
         :meta: private
         """
-        return TaskInstance._schedule_downstream_tasks(
-            ti=self, session=session, max_tis_per_query=max_tis_per_query
-        )
+        # we should not schedule downstream tasks with Pydantic model because it will not be able to
+        # get the DAG object (we do not serialize it currently).
+        return

Review Comment:
   Yeah. I even attempted that, but the thing is that it is rather useless - effectively this one runs in a remote component (internal API) that does not have the DAG object, and the only way to get that DAG object is to parse it. The whole idea of the mini-scheduler is that we already have the DAG object, so that we can get the downstream deps - and it would also mean that we have to parse the DAG in the internal_api component, or serialize the DAG object on the worker and parse it there. I'd opt for the latter - because if we start parsing the DAG in the internal-api component, there is a risk of the DAG-parsing code accessing the DB directly (the internal_api component has access to the DB). Not to mention that most of the benefits of the mini-scheduler (DAG already loaded in memory) are much smaller in that setup.
   
   So the easiest way is to just skip the mini-scheduler for now. But we can bring it back later.
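   The pattern the diff applies can be sketched in a self-contained way as follows. This is a simplified illustration, not Airflow's actual code: the two classes below are hypothetical stand-ins for the ORM `TaskInstance` (which has the parsed DAG in memory) and the Pydantic serialized variant used over the internal API (which does not).

```python
class TaskInstance:
    """Stand-in for the ORM TaskInstance, which holds the parsed DAG in memory."""

    def __init__(self, dag):
        # dag is a hypothetical minimal representation with downstream deps.
        self.dag = dag

    def schedule_downstream_tasks(self):
        # The mini-scheduler walks the in-memory DAG to find downstream tasks.
        return list(self.dag["downstream"])


class TaskInstancePydantic:
    """Stand-in for the serialized TaskInstance sent over the internal API;
    it carries no DAG object."""

    def schedule_downstream_tasks(self):
        # We cannot schedule downstream tasks here: the DAG is not serialized
        # with the task instance, and parsing it in the internal-api component
        # would let DAG-parsing code reach the DB directly. Skip for now.
        return None
```

   The design choice mirrors the comment: rather than teach the remote component to parse DAGs (and inherit its DB access), the serialized model simply no-ops until the DAG can be shipped from the worker.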



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to