eskarimov commented on a change in pull request #17446:
URL: https://github.com/apache/airflow/pull/17446#discussion_r686834461
##########
File path: airflow/providers/google/cloud/hooks/gcs.py
##########
@@ -1207,3 +1204,7 @@ def _parse_gcs_url(gsurl: str) -> Tuple[str, str]:
# Remove leading '/' but NOT trailing one
blob = parsed_url.path.lstrip('/')
return bucket, blob
+
+
+def _normalize_directory_path(source_object: Optional[str]) -> Optional[str]:
+ return source_object + "/" if source_object and not
source_object.endswith("/") else source_object
Review comment:
Hi @uranusjr ,
thank you for reviewing, I'd like to share why I think it'd be better to
keep the function in its current place:
- If we move it under `airflow.utils` it might be treated as a general
function for normalizing directory path (for instance removing redundant
separators or converting slashes to backslashes for Windows, etc), while here
it's purely about having a slash at the end of the path.
- Another options is moving it under `airflow.providers.google.cloud.utils`,
but it seems less logical to me, because the function is related with
normalizing GCS path, I'd expect to find this function under the class/file
related with GCS together with other functions.
Regarding the underscore prefix: I understand and share your point to get
rid of the leading underscore, as we import this function in other modules, so
it doesn't seem like something for internal use.
On another side I've tried to follow the same pattern as the previous
function in the same file
[`_parse_gcs_url`](https://github.com/apache/airflow/blob/8d2d264cdd7bb79b85b2b6124c7fc596ac9142f8/airflow/providers/google/cloud/hooks/gcs.py#L1192-L1206)
- it's used across [BigQuery
](https://github.com/apache/airflow/blob/67cbb0f181f806edb16ca12fb7a2638b5f31eb58/airflow/providers/google/cloud/operators/bigquery.py#L39),
[AzureFileShareToGCSOperator
](https://github.com/apache/airflow/blob/866a601b76e219b3c043e1dbbc8fb22300866351/airflow/providers/google/cloud/transfers/azure_fileshare_to_gcs.py#L24),
[S3ToGCSOperator](https://github.com/apache/airflow/blob/866a601b76e219b3c043e1dbbc8fb22300866351/airflow/providers/google/cloud/transfers/s3_to_gcs.py#L25)
operators. Please let me know your thoughts.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]