potiuk commented on code in PR #38155:
URL: https://github.com/apache/airflow/pull/38155#discussion_r1557289295
##########
airflow/models/taskinstance.py:
##########
@@ -1409,7 +1410,9 @@ class TaskInstance(Base, LoggingMixin):
cascade="all, delete, delete-orphan",
)
note = association_proxy("task_instance_note", "content",
creator=_creator_note)
+
task: Operator | None = None
+ _thread_local_data = threading.local()
Review Comment:
Yes - thread-local variables should be stored in a (private) global variable
in "task_instance" in this case.
Yes - I know it's not perfect and also a bit hacky - I am not sure if there
is a better way though.
I'd love other opinions on that one. Maybe my reservations on adding to
TaskInstance were just too cautious. @mobuchowski and @ashb - maybe as the
authors of listeners you have some opinion:
Some more context: as originally explained by @vandonr :
> I was writing a listener, and it seems extremely hard to get some
information on the error in the on_task_instance_failed callback, because the
error is not passed as a parameter to the callback itself 😞
> It's [written to the
context](https://github.com/apache/airflow/blob/9e97433dc3368138431305c5161a007e4fc5f227/airflow/models/taskinstance.py#L2809-L2810)
a bit further down, but we don't have that yet when the callback is called.
> We cannot add an extra parameter now because it'd be a breaking change,
but what do you think about storing the error in the TaskInstance object before
calling on_task_instance_failed ? It'd be a pretty cheap way to solve that
issue (if it's one!). We don't need to persist that in DB or anything, it just
needs to carry the value to the method call that just follows, seems simple
enough ?
My point is that we should not make it a "hack" but simply extend the API to
include the error message.
Originally Raphael proposed to add the error to the TaskInstance object - seems
simple, but feels hacky as we are modifying an object that is a data model and
dynamically adding an error message to it. Plus, TaskInstance gets serialized
back and forth in some places, so that seems a bit out of place.
I proposed to use a thread-local variable - but that also ends up hacky: the
error message would be stored in a private global variable in
`task_instance.py` and be accessible via `get_last_error_mesage()`. But I have
no other idea how to pass the error message to the listener in this case. Maybe
you could help?