Lavedonio opened a new issue, #24352: URL: https://github.com/apache/airflow/issues/24352
### Apache Airflow Provider(s)

google

### Versions of Apache Airflow Providers

All versions.

```
apache-airflow-providers-google>=1.0.0b1
apache-airflow-backport-providers-google>=2020.5.20rc1
```

### Apache Airflow version

2.3.2 (latest released)

### Operating System

macOS 12.3.1

### Deployment

Composer

### Deployment details

_No response_

### What happened

I'm currently running the upgrade check on Airflow 1.10.15, and one of the items is to change import locations from contrib to the specific provider. While replacing:

`airflow.contrib.operators.gcs_delete_operator.GoogleCloudStorageDeleteOperator`

with:

`airflow.providers.google.cloud.operators.gcs.GCSDeleteObjectsOperator`

an error appeared in the UI:

`Broken DAG: [...] Either object or prefix should be set. Both are None`

---

Upon further investigation, I found that the `GoogleCloudStorageDeleteOperator` from the contrib module had this parameter check (as can be seen [here](https://github.com/apache/airflow/blob/v1-10-stable/airflow/contrib/operators/gcs_delete_operator.py#L63)):

```python
assert objects is not None or prefix is not None
```

while the new `GCSDeleteObjectsOperator` from the Google provider module has the following (as can be seen [here](https://github.com/apache/airflow/blob/main/airflow/providers/google/cloud/operators/gcs.py#L308-L309)):

```python
if not objects and not prefix:
    raise ValueError("Either object or prefix should be set. Both are None")
```

---

As it turns out, these conditions are not equivalent: a `prefix` set to an empty string won't raise an error in the first case, but will raise one in the second (see the sketch at the end of this issue).

### What you think should happen instead

This behavior does not match the documentation, since passing an empty string as the prefix is perfectly valid when the user wants to delete all objects within the bucket. Furthermore, there was no deliberate change to the API's semantics in that timeframe. The code change happened in [this commit](https://github.com/apache/airflow/commit/25e9047a4a4da5fad4f85c366e3a6262c0a4f68e#diff-c45d838a139b258ab703c23c30fd69078108f14a267731bd2be5cc1c8a7c02f5), where the developer's intent was clearly to remove assertions, not to change the validation logic. In fact, it even relates to a PR for [this Airflow JIRA ticket](https://issues.apache.org/jira/browse/AIRFLOW-6193).

### How to reproduce

Add a `GCSDeleteObjectsOperator` with the parameter `prefix=""` to a DAG. Example:

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.providers.google.cloud.operators.gcs import GCSDeleteObjectsOperator

with DAG('test_dag', schedule_interval=timedelta(days=1), start_date=datetime(2022, 1, 1)) as dag:
    task = GCSDeleteObjectsOperator(
        task_id='task_that_generates_ValueError',
        bucket_name='some_bucket',
        prefix=''
    )
```

### Anything else

In my opinion, the error message isn't very helpful either, since it just breaks the DAG without pointing out which task is causing the issue. It took me 20 minutes to pinpoint the exact task in my case, since I was dealing with a DAG with a lot of tasks. Adding the `task_id` to the error message could improve the developer experience here.

### Are you willing to submit PR?

- [X] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
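---

For illustration, here is a minimal, self-contained sketch of the difference between the two checks and of one possible fix. The function names (`contrib_check`, `provider_check`, `proposed_check`) are mine, not the provider's actual code, and the suggested check is only one way the old semantics could be restored while also naming the offending task:

```python
def contrib_check(objects, prefix):
    # Old GoogleCloudStorageDeleteOperator behaviour: only None fails the check.
    return objects is not None or prefix is not None


def provider_check(objects, prefix):
    # Current GCSDeleteObjectsOperator behaviour: any falsy value fails,
    # so an empty string (or an empty list) is rejected as well.
    return bool(objects) or bool(prefix)


def proposed_check(objects, prefix, task_id):
    # Hypothetical fix: compare against None explicitly and include the task_id
    # in the error message so the broken task is easy to locate.
    if objects is None and prefix is None:
        raise ValueError(
            f"Either objects or prefix should be set in task {task_id!r}. Both are None."
        )


print(contrib_check(None, ""))    # True  -> DAG used to parse fine
print(provider_check(None, ""))   # False -> "Broken DAG" in the UI
proposed_check(None, "", "task_that_generates_ValueError")  # no exception raised
```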