sai3563 opened a new issue, #26448: URL: https://github.com/apache/airflow/issues/26448
### Apache Airflow version

2.3.4

### What happened

Hello, I've come across something that I think is most likely a bug in Airflow. After I set up remote logging to Google Cloud Storage, I noticed that whenever Airflow tried to save logs to the cloud, I would get this error:

```
google.api_core.exceptions.NotFound: 404 GET https://storage.googleapis.com/download/storage/v1/b/airflow_logs/o/var%2Flog%2Fairflow%2Fdag_id%3Dtest_dag%2Frun_id%3Dscheduled__2022-09-16T16%3A34%3A00%2B00%3A00%2Ftask_id%3Dbranch_decision%2Fattempt%3D1.log?alt=media: No such object: airflow_logs/var/log/airflow/dag_id=test_dag/run_id=scheduled__2022-09-16T16:34:00+00:00/task_id=branch_decision/attempt=1.log: ('Request failed with status code', 404, 'Expected one of', <HTTPStatus.OK: 200>, <HTTPStatus.PARTIAL_CONTENT: 206>)
```

But on Google Cloud Storage, the logs were being saved just fine. On analyzing the relevant source code, I found this:

```python
def gcs_write(self, log, remote_log_location):
    """
    Writes the log to the remote_log_location. Fails silently if no log was created.

    :param log: the log to write to the remote_log_location
    :type log: str
    :param remote_log_location: the log's location in remote storage
    :type remote_log_location: str (path)
    """
    try:
        blob = storage.Blob.from_string(remote_log_location, self.client)
        old_log = blob.download_as_bytes().decode()
        log = '\n'.join([old_log, log]) if old_log else log
    except Exception as e:  # pylint: disable=broad-except
        if not hasattr(e, 'resp') or e.resp.get('status') != '404':  # pylint: disable=no-member
            log = f'*** Previous log discarded: {str(e)}\n\n' + log
            self.log.info("Previous log discarded: %s", e)

    try:
        blob = storage.Blob.from_string(remote_log_location, self.client)
        blob.upload_from_string(log, content_type="text/plain")
    except Exception as e:  # pylint: disable=broad-except
        self.log.error('Could not write logs to %s: %s', remote_log_location, e)
```

The `gcs_write` function writes the log, but before writing it first tries to read the existing remote log (so it can append to it). On a task's first attempt that file does not exist yet, so the read fails with a 404 and control passes to the `except` block. There, the condition checks whether the exception lacks a `resp` attribute, and `google.api_core.exceptions.NotFound` does lack it. These are the attributes it actually has:

```
['__annotations__', '__cause__', '__class__', '__context__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__', '__suppress_context__', '__traceback__', '__weakref__', '_details', '_error_info', '_errors', '_response', 'args', 'code', 'details', 'domain', 'errors', 'grpc_status_code', 'message', 'metadata', 'reason', 'response', 'with_traceback']
```

So the `if` condition is met and the "Previous log discarded" error is reported. As soon as that part is done, execution continues to the write, which succeeds. Weird.
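Since the attribute dump above shows the exception carries the HTTP status on `code` (not `resp`), one possible fix is to key the check off that attribute. Below is a minimal, self-contained sketch of the read-then-merge logic with that corrected check — `NotFoundStub`, `merge_with_previous_log`, and the helper names are stand-ins for illustration, not actual Airflow or google-cloud-storage code:

```python
class NotFoundStub(Exception):
    """Stand-in for google.api_core.exceptions.NotFound, which carries code == 404."""
    code = 404


def merge_with_previous_log(download_previous, new_log):
    """Read-before-write merge: append new_log to any previous log.

    A 404 on the initial read means "no previous log yet" (e.g. a task's
    first attempt) and is not treated as an error.
    """
    try:
        old_log = download_previous()
        return "\n".join([old_log, new_log]) if old_log else new_log
    except Exception as e:
        # The buggy code checked e.resp.get('status'); the google.api_core
        # exception exposes the status on e.code instead.
        if getattr(e, "code", None) != 404:
            return f"*** Previous log discarded: {e}\n\n" + new_log
        return new_log  # first write: nothing to merge, no spurious error


def raise_not_found():
    raise NotFoundStub("404 No such object")


def raise_other():
    raise RuntimeError("network down")
```

With this check, a first-attempt 404 passes through silently, while any other read failure still prepends the "Previous log discarded" banner.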
### What you think should happen instead

The error should not be thrown, since the logs are being written to Google Cloud Storage successfully.

### How to reproduce

Enable remote logging to Google Cloud Storage in Airflow, run any task that generates logs, and check the Airflow worker logs.

### Operating System

Ubuntu 20.04 LTS

### Versions of Apache Airflow Providers

```
apache-airflow-providers-amazon==5.1.0
apache-airflow-providers-celery==3.0.0
apache-airflow-providers-common-sql==1.2.0
apache-airflow-providers-ftp==3.1.0
apache-airflow-providers-google==8.3.0  <-- This is the relevant library
apache-airflow-providers-http==4.0.0
apache-airflow-providers-imap==3.0.0
apache-airflow-providers-mongo==3.0.0
apache-airflow-providers-mysql==3.2.0
apache-airflow-providers-slack==5.1.0
apache-airflow-providers-sqlite==3.2.1
```

### Deployment

Virtualenv installation

### Deployment details

_No response_

### Anything else

_No response_

### Are you willing to submit PR?

- [X] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
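For reference, GCS remote logging is enabled via the `[logging]` section of `airflow.cfg`; a sketch is below, where the bucket name and connection id are placeholders, not values from this report:

```ini
[logging]
remote_logging = True
remote_base_log_folder = gs://airflow_logs
remote_log_conn_id = google_cloud_default
```

With this in place, any task run will attempt the `gcs_write` read-then-write path described above and surface the spurious 404 in the worker logs.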
