sweco edited a comment on issue #19329:
URL: https://github.com/apache/airflow/issues/19329#issuecomment-963226239
Hey there, we're observing the same issue on Airflow 2.1.4 in a large DAG
with hundreds of tasks. Looking at the scheduler logs from the moment the
downstream task was marked `upstream_failed`, it appears to have happened
when the pod for an upstream task could not be started.
The question, however, is: why is the task put in a failed state instead of
being retried? And if this is expected behavior, why doesn't the downstream
task wait for all the retries to finish?
```
Pod "pod_name" has been pending for longer than 300 seconds. It will be
deleted and set to failed.
Event: pod_name had an event of type MODIFIED
Event: pod_name Pending
Event: pod_name had an event of type DELETED
Event: Failed to start pod pod_name
Attempting to finish pod; pod_id: pod_name; state: failed; annotations:
{..., 'try_number': 1}
Changing state of task_instance, <TaskInstanceState.FAILED: 'failed'>,
'pod_name', 'airflow', '...') to failed
Executor reports execution of task_id execution_date=execution_date exited
with status failed for try_number 1
Executor reports task instance <TaskInstance: task_id execution_date
[queued]> finished (failed) although the task says its queued. (Info: None) Was
the task killed externally?
Setting task instance <TaskInstance: task_name execution_date [queued]>
state to failed as reported by executor
# Later on
1 tasks up for execution:
<task_id execution_date [scheduled]>
```
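Given the `'try_number': 1` annotation in the log above, one would expect the scheduler to mark the task `up_for_retry` rather than `failed`. Here is a simplified sketch of the decision we would expect — this is not Airflow's actual implementation, just our reading of the documented retry semantics, with hypothetical example values:

```python
def expected_state_after_failed_try(try_number: int, max_tries: int) -> str:
    """Sketch of retry eligibility (NOT Airflow's actual code).

    ``max_tries`` corresponds to the task's ``retries`` setting, and
    ``try_number`` is the attempt that just failed (starting at 1).
    """
    # Retry while configured attempts remain; only fail terminally
    # once the retries are exhausted.
    return "up_for_retry" if try_number <= max_tries else "failed"


# e.g. with retries=2, a failed first try should yield a retry,
# and only the attempt after the last retry should fail terminally:
print(expected_state_after_failed_try(1, 2))  # up_for_retry
print(expected_state_after_failed_try(3, 2))  # failed
```

In the log, however, the executor reports the failure of try 1 and the task instance is set straight to `failed`, which then cascades to the downstream `upstream_failed` state.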
Also, looking at the task that eventually succeeded, it indeed has two tries,
the first of which ended exactly at the timestamp in question.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]