potiuk commented on code in PR #38155:
URL: https://github.com/apache/airflow/pull/38155#discussion_r1557289295


##########
airflow/models/taskinstance.py:
##########
@@ -1409,7 +1410,9 @@ class TaskInstance(Base, LoggingMixin):
         cascade="all, delete, delete-orphan",
     )
     note = association_proxy("task_instance_note", "content", 
creator=_creator_note)
+
     task: Operator | None = None
+    _thread_local_data = threading.local()

Review Comment:
   Yes - thread-local variables should be stored in a private global variable 
in `taskinstance.py` in this case.
   
   Yes - I know it's not perfect and also a bit hacky  - I am not sure if there 
is a better way though.
   
   I'd love other opinions on that one. Maybe my reservations about adding to 
TaskInstance were just too cautious. @mobuchowski and @ashb - maybe as the 
authors of listeners you have an opinion?
   
   Some more context: as originally explained by @vandonr : 
   
   > I was writing a listener, and it seems extremely hard to get some 
information on the error in the on_task_instance_failed callback, because the 
error is not passed as a parameter to the callback itself 😞
   
   > It's [written to the 
context](https://github.com/apache/airflow/blob/9e97433dc3368138431305c5161a007e4fc5f227/airflow/models/taskinstance.py#L2809-L2810)
 a bit further down, but we don't have that yet when the callback is called.
   
   > We cannot add an extra parameter now because it'd be a breaking change, 
but what do you think about storing the error in the TaskInstance object before 
calling on_task_instance_failed ? It'd be a pretty cheap way to solve that 
issue (if it's one!). We don't need to persist that in DB or anything, it just 
needs to carry the value to the method call that just follows, seems simple 
enough ?
   
   My point is that we should not make it a "hack" but simply extend the API to 
include the error message.
   Originally Raphael proposed adding the error to the TaskInstance object - it 
seems simple, but feels hacky, as we would be modifying an object that is a 
data model and dynamically adding an error message to it. Plus, TaskInstance 
gets serialized back and forth in some places, so that seems a bit out of place.
   
   I proposed using a thread-local - but that also ends up hacky, as the error 
message would be stored in a private global variable in `taskinstance.py` and 
be accessible via `get_last_error_mesage()`. But I have no other idea how to 
pass the error message to the listener in this case. Maybe you could help?
   
   
   



