shohamy7 commented on code in PR #36882:
URL: https://github.com/apache/airflow/pull/36882#discussion_r1463452920
##########
airflow/providers/cncf/kubernetes/executors/kubernetes_executor.py:
##########
@@ -434,9 +434,9 @@ def sync(self) -> None:
)
self.fail(task[0], e)
except ApiException as e:
- # These codes indicate something is wrong with pod definition; otherwise we assume pod
- # definition is ok, and that retrying may work
- if e.status in (400, 422):
+ # In case of the below error codes, fail the task and honor the task retires.
+ # Otherwise, go for continuous/infinite retries.
+ if e.status in (400, 403, 404, 422):
Review Comment:
Maybe this could be made configurable via `airflow.cfg` under the
`kubernetes_executor` section, with the default set to the current behavior?
I agree with @hussein-awala that this is a breaking (or at least significant)
change, but I also understand the problem with relying on the
`KubernetesExecutor` to retry infinitely on such errors.
I am not sure how you feel about adding more configuration to the Kubernetes
executor, but it may be a good compromise to let users decide how the executor
should handle this case.
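
To make the suggestion concrete, here is a rough sketch of how such an option might be read and applied. It uses the standard-library `configparser` rather than Airflow's own configuration layer so that it runs on its own. The option name `task_publish_fail_status_codes` and both helper functions are hypothetical names chosen for illustration, not existing Airflow settings or APIs.

```python
import configparser

# Default mirrors the behavior before this PR: only 400 and 422 fail the task.
DEFAULT_FAIL_STATUS_CODES = "400,422"


def load_fail_status_codes(cfg_text: str) -> frozenset:
    """Parse a comma-separated list of HTTP status codes from config text.

    The option name below is an assumption made for this sketch.
    """
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    raw = parser.get(
        "kubernetes_executor",
        "task_publish_fail_status_codes",  # hypothetical option name
        fallback=DEFAULT_FAIL_STATUS_CODES,
    )
    return frozenset(int(code) for code in raw.split(",") if code.strip())


def should_fail_task(status: int, fail_status_codes: frozenset) -> bool:
    """Return True if the task should fail (honoring task retries)
    instead of being re-queued by the executor indefinitely."""
    return status in fail_status_codes


# A user opting in to the behavior proposed in this PR:
user_cfg = """
[kubernetes_executor]
task_publish_fail_status_codes = 400, 403, 404, 422
"""
opted_in = load_fail_status_codes(user_cfg)
defaults = load_fail_status_codes("")  # no section set: keep current behavior
```

With something along these lines, the check in `sync()` would compare `e.status` against the configured set instead of a hard-coded tuple, and users who do nothing would keep today's behavior.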