tuzonghua commented on PR #62424: URL: https://github.com/apache/airflow/pull/62424#issuecomment-3954642257
@SameerMesiah97 I haven't run the entire test suite yet, but I can do so now.

My motivation for this change was actually not `GCSHook` but `GCSDeleteObjectsOperator`: I wanted to pass a list of objects to delete without first needing to check that they all exist (that is, to ignore the `NotFound` error). In an earlier draft of this work, instead of catching the error and logging it as I'm doing now, I added a boolean to both `GCSDeleteObjectsOperator` and `GCSHook.delete` that would suppress the error by passing a no-op lambda as the `on_error` callback to the Google client's `Bucket.delete_blobs` method (docs [here](https://docs.cloud.google.com/python/docs/reference/storage/latest/google.cloud.storage.bucket.Bucket#google_cloud_storage_bucket_Bucket_delete_blobs)). So something like:

```python
def delete(self, bucket_name: str, object_name: str, ignore_error: bool = False) -> None:
    on_error = None
    if ignore_error:
        # The callback receives the blob that was not found.
        on_error = lambda blob: None  # Swallow the NotFound error
    client = self.get_conn()
    bucket = client.bucket(bucket_name)
    blob = bucket.blob(blob_name=object_name)
    bucket.delete_blobs([blob], on_error=on_error)  # instead of Blob.delete
```

This way, the existing behavior of `GCSHook.delete` doesn't change. However, it would preclude logging which blobs don't exist when the boolean is set to `True`, since the error would be swallowed and never raised.
