philipherrmann commented on issue #12111: URL: https://github.com/apache/airflow/issues/12111#issuecomment-734829587
I think this is because of the `reattach_on_restart`idea. The parameter is documented as "if the scheduler dies while the pod is running, reattach and monitor", but, I think, "while the pod is running" seems to checked in an incorrect manner. It seems that the Operator searches the pod's metadata for an "already_checked" label with value True. This label seems to be set only in the "call stack" ``` handle_pod_overlap > monitor_launched_pod > final_state != State.SUCCESS ``` This might be intended, in terms of idempotency - unclear to me. At least it unexpected to me. Possible workarounds are setting `is_delete_operator_pod = True` or `reattach_on_restart = False` when construction the Operator. While both settings allow retries, they have different side effects. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected]
