dstandish commented on PR #31622: URL: https://github.com/apache/airflow/pull/31622#issuecomment-1793124398
@vincbeck can you share an example traceback that caused you to add tenacity to `consume_logs`? I am just digging into KPO logging a bit again, and finding that the whole thing has become extremely messy and complicated, and I'm trying to chip away at some of the mess. And as part of that, the behavior of consume_logs is such that it is already called in a loop if there's an error. So it's a bit odd to also wrap it with tenacity, a kind of mixing of two different retry strategies that makes it a bit confusing. Now what can be done.... Well please observe that `consume_logs` calls `read_pod_logs`, which I believe is where the APIException that you catch is actually encountered, and which already retries using tenacity. So my thought is we can just update that function to handle your case, and thus move the retry logic closer to where it is needed, and reduce the tenacity wrappers from 2 to 1. But to do this I need more information about specifically where that exception is raised. My suspicion is that it would be raised at `logs = self._client.read_namespaced_pod_log` in `read_pod_logs`. But I recognize it's also possible that it is not raised until the logs consumer is iterated in at `for raw_line in logs:` in `consume_logs`. Thanks for the help. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
