tanelk commented on code in PR #23846:
URL: https://github.com/apache/airflow/pull/23846#discussion_r890276199
##########
airflow/jobs/scheduler_job.py:
##########
@@ -664,7 +663,20 @@ def _process_executor_events(self, session: Session = None) -> int:
ti.pid,
)
- if ti.try_number == buffer_key.try_number and ti.state == State.QUEUED:
+ # There are two scenarios why the same TI with the same try_number is queued
+ # after executor is finished with it:
+ # 1) the TI was killed externally and it had no time to mark itself failed
+ # - in this case we should mark it as failed here.
+ # 2) the TI has been requeued after getting deferred - in this case either our executor has it
+ # or the TI is queued by another job. Either ways we should not fail it.
+
+ # All of this could also happen if the state is "running",
+ # but that is handled by the zombie detection.
+
+ ti_queued = ti.try_number == buffer_key.try_number and ti.state == TaskInstanceState.QUEUED
+ ti_requeued = ti.queued_by_job_id != self.id or self.executor.has_task(ti)
Review Comment:
I did consider this, but I don't think it adds any value. In the case of a fast
trigger, it will be "recent" even on the "first go".
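
For readers following along, here is a minimal standalone sketch of the check the hunk above sets up. It is not the scheduler code: `FakeTI`, `should_fail`, and the plain `"queued"` string are hypothetical stand-ins, and combining the two flags as `ti_queued and not ti_requeued` is an assumption based on the diff comments, since the hunk is cut off before the `if`.

```python
from dataclasses import dataclass
from typing import Optional

QUEUED = "queued"  # stand-in for TaskInstanceState.QUEUED


@dataclass
class FakeTI:
    """Hypothetical, stripped-down task instance used only for this sketch."""
    try_number: int
    state: str
    queued_by_job_id: Optional[int]


def should_fail(ti: FakeTI, event_try_number: int, scheduler_job_id: int,
                executor_has_task: bool) -> bool:
    """Return True when the TI looks like scenario 1 (killed externally)."""
    # Same try_number as the executor event, and still sitting in QUEUED.
    ti_queued = ti.try_number == event_try_number and ti.state == QUEUED
    # Scenario 2: queued again by another job, or our executor already has it.
    ti_requeued = ti.queued_by_job_id != scheduler_job_id or executor_has_task
    # Assumed combination: only fail a queued TI that was not requeued.
    return ti_queued and not ti_requeued
```

Under these assumptions, a TI queued by this scheduler job that the executor no longer holds is failed, while a TI queued by another job, or one the executor still has, is left alone.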