vandonr-amz opened a new pull request, #30134: URL: https://github.com/apache/airflow/pull/30134
we've seen that every once in a while, an eks system test will fail with a timeout on the pod starting. Since we delete all resources in the teardown, we're pretty clueless on what's happening, and the failures appear to be random so we cannot reproduce it in a controlled environment. With this change, we'll get at least a pod description if there is a failure, which should help us understand the root cause. This step is run only on failure (`trigger_rule=TriggerRule.ONE_FAILED`), expecting to run after the start_pod operation failed. It is possible that an earlier step would fail and trigger this, but that's how life is. The main point is that we don't want to run it every single time because it makes the test ~1 minute longer (installing awscli, necessary to get the kube config, is quite long). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
