A-Costa commented on issue #32650:
URL: https://github.com/apache/airflow/issues/32650#issuecomment-1708539407

   Hi @shahar1, I've published a very rudimentary draft PR #34137.
   
   The first issue I'm encountering is the following: while it was easy to 
adapt the `poke` method to use the `match_glob` parameter, it is not easy to 
adjust the `execute` method, in particular when the sensor is set with 
`deferrable=True`.
   The problem here is that the sensor is using the `GCSBlobTrigger`, which in 
turn uses the 
[`Bucket`](https://github.com/talkiq/gcloud-aio/blob/b170cf916f0ad52357bd3457ce5b905dd9170132/storage/gcloud/aio/storage/bucket.py#L30)
 class from the [gcloud-aio](https://github.com/talkiq/gcloud-aio) library.
   
   The `GCSBlobTrigger` is using the method `blob_exists` which only works with 
the exact filename.
   
   
https://github.com/apache/airflow/blob/b9acffa81bf61dcf0c5553942c52629c7f75ebe2/airflow/providers/google/cloud/triggers/gcs.py#L101
   
   Another method called 
[`list_blobs`](https://github.com/talkiq/gcloud-aio/blob/b170cf916f0ad52357bd3457ce5b905dd9170132/storage/gcloud/aio/storage/bucket.py#L64)
 is available in the class and is in fact used by `GCSPrefixBlobTrigger`. The 
problem is that `list_blobs` only implements the `prefix` parameter and not a 
glob-matching one.
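
   To make the gap concrete, here is a minimal sketch of matching object names against a glob on the client side. The helper names `glob_to_regex` and `filter_by_glob` are hypothetical, and the wildcard semantics are a simplifying assumption (`**` crosses `/`, while `*` and `?` do not), not the full grammar the GCS API accepts for `matchGlob`.

```python
import re
from typing import Iterable, List


def glob_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a simplified glob into a compiled regular expression.

    Assumed semantics (a simplification, not the complete GCS grammar):
      **  matches any run of characters, including "/"
      *   matches any run of characters except "/"
      ?   matches exactly one character except "/"
    Every other character is matched literally.
    """
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")
            i += 1
        elif pattern[i] == "?":
            parts.append("[^/]")
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def filter_by_glob(names: Iterable[str], pattern: str) -> List[str]:
    """Keep only the names that match the whole pattern."""
    regex = glob_to_regex(pattern)
    return [name for name in names if regex.fullmatch(name)]
```

   A prefix listing alone cannot express this: with the prefix `data/`, both `data/a.csv` and `data/sub/b.csv` come back, whereas the pattern `data/*.csv` should keep only the first.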
   
   I'm now going to open an issue on the `gcloud-aio` repository to see if they 
are willing to add this functionality; otherwise I guess the only viable 
approach would be to implement it ourselves. 
   
   Basically it's the same issue that you had to solve when implementing 
[`_list_blobs_with_match_glob`](https://github.com/apache/airflow/blob/b9acffa81bf61dcf0c5553942c52629c7f75ebe2/airflow/providers/google/cloud/hooks/gcs.py#L847)
 but for the async version of it. 
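
   One possible shape for an async workaround, as a rough sketch only: list by the literal prefix of the glob, then filter the returned names on the client. This is a different technique from a server-side glob parameter. The names `literal_prefix`, `list_blobs_with_match_glob`, and `fake_list_by_prefix` are hypothetical, and the lister is passed in as a plain async callable so the sketch does not assume any particular `gcloud-aio` signature. It uses Python's `fnmatch`, whose `*` also matches `/`, so its semantics may differ from what the GCS API does with a glob.

```python
import asyncio
import fnmatch
from typing import Awaitable, Callable, List


def literal_prefix(pattern: str) -> str:
    """Return the part of the glob that precedes the first wildcard."""
    for i, ch in enumerate(pattern):
        if ch in "*?[":
            return pattern[:i]
    return pattern


async def list_blobs_with_match_glob(
    list_by_prefix: Callable[[str], Awaitable[List[str]]],
    pattern: str,
) -> List[str]:
    """List names by literal prefix, then filter them with the glob.

    `list_by_prefix` is an assumption: any async callable that takes a
    prefix and returns object names.
    """
    names = await list_by_prefix(literal_prefix(pattern))
    # fnmatchcase is case-sensitive; note that its "*" also matches "/".
    return [name for name in names if fnmatch.fnmatchcase(name, pattern)]


async def fake_list_by_prefix(prefix: str) -> List[str]:
    """In-memory stand-in for a real bucket listing (demo only)."""
    objects = [
        "logs/2023-09-01/part-0.json",
        "logs/2023-09-01/part-10.json",
        "logs/2022-01-01/part-0.json",
        "data/x.json",
    ]
    return [name for name in objects if name.startswith(prefix)]


if __name__ == "__main__":
    matches = asyncio.run(
        list_blobs_with_match_glob(fake_list_by_prefix, "logs/2023-*/part-?.json")
    )
    print(matches)
```

   Narrowing by the literal prefix first keeps the listing small, but all filtering still happens after the names are fetched, so a trigger built this way would pay for the full prefix listing on every poll.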


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
