potiuk commented on PR #28900:
URL: https://github.com/apache/airflow/pull/28900#issuecomment-1465238532

   > @potiuk Should we also reflect methods defined in 
`airflow/models/taskinstance.py` in `airflow/models/pydantic/taskinstance.py`? 
Tests are complaining about it.
   
   Yes. How I envision it (@mhenc - would love your feedback here) - we should 
generally separate out TaskInstance "object"  methods to a separate regular 
Python functions taking `TaskInstance | TaskInstancePydantic` parameter (same 
for the other objects).  basically extracting them out from the Object of the 
ORM to be just plain python function (can be in task_instance_module, no 
problem with it).  And we can have TaskInstance's and TaskInstancePydantic 
methods just call the methods with passing 'self" parameter. Those will be the 
methods that would make sense to be used in jinja templates or in 
TaskFlow/Regular operators via Context should also be added to the Pydantic 
dataclass and calll those functions - because users might want to call those 
methods without changing their dags.
   
   See also comment in 
https://github.com/apache/airflow/discussions/29831#discussioncomment-5237871 
wher
   
   Most likely not all those methods can be implemented this way (thoe which 
access DB directly) and this is expected that the users might have to conver 
their JINJA usages in those cases, but for most parts, simple "utility" 
functions that operate on plain field values of the objects should be 
unaffected. If we see a method that makes no sense to be used on "Pydantic" 
object we should not touch it and simply make sure it is only called in the 
context where we know ORM model is used (i.e. basically on the server side of 
the internal API calls).
   
   This approach has several advantages:
   * the changes in code all over the places in airflow where the methods are 
used will be minimal
   * in most "reasonable" use cases where task instance and other objects are 
used in JINJA or via Context, they will **just work**
   * in cases where such access would require DB access call, the user will get 
MyPy check that they should not do it (because  such method will be missing in 
Context's `TaskInstance | TaskInstancePydantic` type - and the user will be 
able to decide whether they can ignore/cast such object safely (when they did 
not enable DB Internal API). I tihnk it's gentle "invitation" to refactoring of 
your DAG code.
   
   WDYT?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to