mfridrikhson commented on issue #13692: URL: https://github.com/apache/airflow/issues/13692#issuecomment-1138530242
We had the same issue with retrieving task logs, but from what I've been able to find out, it does not look related to any connection issues with the actual log provider. Our specific case was the following: a task failed and the task instance was marked `failed` in the DAG view. Opening the task view showed 2 log tabs: the first with the `Failed to fetch log file` message and the second with a successful run and all the logs. I went to check the log folder and found that there were only logs for the second tab (task try):

_(Ignore the 3rd log file - it's from a later rerun after the incident I'm describing had happened)_

The actual reason the UI renders this error (at least in our case) is that it renders `max_try_number` tabs and queries the contents for each one by its index. There happened to be no file for tab #1, so it showed the error message (see the sketch at the end of this comment). So my next guess was that the task instance's `try_number` somehow got incremented one extra time. And that's why I find @doowhtron's [comment](https://github.com/apache/airflow/issues/13692#issuecomment-800085669) important here - maybe some kind of race condition causes the task to execute twice or something.

Some notes about our case and environment:

- The issue happens intermittently (usually everything works fine)
- It is not tied to a specific task or operator (both sensors and regular tasks had this issue)
- It does not correlate with high (or low) cluster load, so I don't think it is performance-related
- We use Airflow v2.2.4
- We store logs on S3
- We run a **single** scheduler
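To illustrate the mechanism, here is a minimal sketch (hypothetical file naming and function names, not the actual Airflow UI code) of the behavior described above: the view builds one tab per try number up to `max_try_number` and fetches each log independently, so a missing file for an intermediate try surfaces as the error message even when the log backend is perfectly healthy.

```python
# Hypothetical sketch of the tab-rendering behavior described above;
# not the real Airflow webserver code.
import os
from typing import List

FAILED_MSG = "Failed to fetch log file"


def render_log_tabs(log_dir: str, max_try_number: int) -> List[str]:
    """Return the content shown in each log tab for a task instance.

    The UI assumes a log file exists for every try from 1..max_try_number.
    If try_number was bumped without the corresponding attempt writing a
    log (e.g. the race condition suspected above), the tab for that index
    shows the error message even though the log storage itself is fine.
    """
    tabs = []
    for try_number in range(1, max_try_number + 1):
        # Hypothetical per-attempt file layout for illustration only.
        path = os.path.join(log_dir, f"attempt={try_number}.log")
        try:
            with open(path) as f:
                tabs.append(f.read())
        except FileNotFoundError:
            # The tab renders the error; no backend call actually failed.
            tabs.append(FAILED_MSG)
    return tabs


# Example matching our incident: max_try_number == 2 but only
# attempt=2.log exists on disk, so tab 1 shows FAILED_MSG and
# tab 2 shows the real logs.
```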
