tuzonghua commented on PR #62424:
URL: https://github.com/apache/airflow/pull/62424#issuecomment-3954642257

   @SameerMesiah97 I haven't run the entire test suite yet but can do so now.
   
   My motivation for this change was actually not in `GCSHook` but instead in 
`GCSDeleteObjectsOperator`, where I wanted to give a list of objects to delete 
without first needing to check that they all exist (that is, to ignore the 
`NotFound` error). In an earlier draft of this work, instead of catching the 
error and logging it as I do now, I added a boolean flag to both 
`GCSDeleteObjectsOperator` and `GCSHook.delete` that would suppress the error 
by passing a no-op lambda as the `on_error` callback in the Google client 
`Bucket.delete_blobs` method (docs 
[here](https://docs.cloud.google.com/python/docs/reference/storage/latest/google.cloud.storage.bucket.Bucket#google_cloud_storage_bucket_Bucket_delete_blobs)).
 So something like:
   ```python
   def delete(self, bucket_name: str, object_name: str, ignore_error: bool = False) -> None:
       on_error = None
       if ignore_error:
           # delete_blobs calls on_error(blob) for each blob that raises NotFound
           on_error = lambda blob: None  # Swallow the NotFound error

       client = self.get_conn()
       bucket = client.bucket(bucket_name)
       blob = bucket.blob(blob_name=object_name)
       bucket.delete_blobs([blob], on_error=on_error)  # instead of Blob.delete
   ```
   
   In this way, the existing behavior of `GCSHook.delete` doesn't change. 
However, it would preclude logging which blobs don't exist when the boolean is 
set to `True` since the error would never be raised.

