SamWheating commented on code in PR #22976:
URL: https://github.com/apache/airflow/pull/22976#discussion_r869782418
##########
airflow/executors/kubernetes_executor.py:
##########
@@ -689,9 +689,14 @@ def _change_state(self, key: TaskInstanceKey, state: Optional[str], pod_id: str,
         self.event_buffer[key] = state, None
     def try_adopt_task_instances(self, tis: List[TaskInstance]) -> List[TaskInstance]:
-        tis_to_flush = [ti for ti in tis if not ti.queued_by_job_id]
-        scheduler_job_ids = {ti.queued_by_job_id for ti in tis}
-        pod_ids = {ti.key: ti for ti in tis if ti.queued_by_job_id}
+        scheduler_job_ids = {ti.queued_by_job_id for ti in tis if ti.queued_by_job_id}
+
+        # Tasks triggered through API will have no ti.queued_by_job_id
+        # and their pod will have label 'airflow-worker=manual'
+        if any(ti for ti in tis if not ti.queued_by_job_id):
Review Comment:
This change will fix the issue of tasks being incorrectly reset, but I think
that in some cases we could still end up with empty `queued_by_job_id` fields
in the database, which might cause issues elsewhere or later on.
Would it be possible instead to just set the `queued_by_job_id` field to
`manual` when the TaskInstance is initially queued? We already have to manually
write the `queued_dttm` for similar reasons.
https://github.com/apache/airflow/blob/cfa95af7e83b067787d8d6596caa3bc97f4b25bd/airflow/www/views.py#L1811-L1819
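A minimal sketch of the idea, not the actual Airflow code: `FakeTaskInstance`, `MANUAL_JOB_ID`, and `queue_manually` are hypothetical names made up for illustration, and the field is treated as a string here even though the real column type may not accept one. It only shows that stamping a sentinel at queue time leaves no empty value for the adoption logic to special-case.

```python
from dataclasses import dataclass
from typing import List, Optional, Set

# Hypothetical sentinel, chosen to mirror the 'airflow-worker=manual' pod label.
MANUAL_JOB_ID = "manual"


@dataclass
class FakeTaskInstance:
    """Stand-in for TaskInstance; carries only the field discussed here."""

    task_id: str
    queued_by_job_id: Optional[str] = None


def queue_manually(ti: FakeTaskInstance) -> FakeTaskInstance:
    """Hypothetical helper: stamp the sentinel when no scheduler job queued the task."""
    if not ti.queued_by_job_id:
        ti.queued_by_job_id = MANUAL_JOB_ID
    return ti


def scheduler_job_ids(tis: List[FakeTaskInstance]) -> Set[str]:
    """Collect the job ids to match pods against, skipping any empty values."""
    return {ti.queued_by_job_id for ti in tis if ti.queued_by_job_id}


# One task queued by a scheduler job, one queued through the API/UI path.
tis = [FakeTaskInstance("a", "42"), queue_manually(FakeTaskInstance("b"))]
ids = scheduler_job_ids(tis)
```

With every row stamped, the `if not ti.queued_by_job_id` branch in `try_adopt_task_instances` would have nothing left to catch, and manually queued tasks would be looked up like any other job id.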