kevgeo commented on code in PR #37237:
URL: https://github.com/apache/airflow/pull/37237#discussion_r1482467030


##########
airflow/providers/google/cloud/hooks/gcs.py:
##########
@@ -1295,37 +1300,45 @@ def _prepare_sync_plan(
         destination_object: str | None,
         recursive: bool,
     ) -> tuple[set[storage.Blob], set[storage.Blob], set[storage.Blob]]:
-        # Calculate the number of characters that remove from the name, because they contain information
+        # Calculate the number of characters that are removed from the name, because they contain information
         # about the parent's path
         source_object_prefix_len = len(source_object) if source_object else 0
         destination_object_prefix_len = len(destination_object) if destination_object else 0
         delimiter = "/" if not recursive else None
+
         # Fetch blobs list
         source_blobs = list(source_bucket.list_blobs(prefix=source_object, delimiter=delimiter))
         destination_blobs = list(
             destination_bucket.list_blobs(prefix=destination_object, delimiter=delimiter)
         )
+
         # Create indexes that allow you to identify blobs based on their name
         source_names_index = {a.name[source_object_prefix_len:]: a for a in source_blobs}
         destination_names_index = {a.name[destination_object_prefix_len:]: a for a in destination_blobs}
+
         # Create sets with names without parent object name
         source_names = set(source_names_index.keys())
+        # Discards empty string that creates an empty source subdirectory

Review Comment:
   No, I meant that the `source_names` set contains an empty string. Because of
that entry, the destination object is assigned the value of the source object
(the subdirectory), as seen in the
[rewrite](https://github.com/apache/airflow/blob/1e4d55777b556c395ff4156e4b68090b9f1e2c6f/airflow/providers/google/cloud/hooks/gcs.py#L236)
 function, which creates an empty subdirectory in the destination bucket.
   
   This is the reason why I am removing the empty string.
   
   Maybe I can reword the comment as:
   `# Discards empty string from source set that creates an empty subdirectory 
in destination bucket with source subdirectory name`
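
   To illustrate how the empty string can end up in `source_names`, here is a
   minimal sketch that uses plain strings instead of `storage.Blob` objects. The
   blob names are hypothetical, and it assumes the listing returns a placeholder
   object whose name equals the prefix itself (`data/`). The `discard("")` call
   is an assumption about how the fix could be written, not a quote from the PR.

```python
# Hypothetical blob names; "data/" stands in for a placeholder object
# whose name is exactly the source prefix (the subdirectory itself).
source_object = "data/"
blob_names = ["data/", "data/a.csv", "data/b.csv"]

prefix_len = len(source_object) if source_object else 0

# Same prefix slicing as in _prepare_sync_plan, applied to plain strings.
# The placeholder "data/" is reduced to the empty string.
source_names = {name[prefix_len:] for name in blob_names}
has_empty_before = "" in source_names

# set.discard() removes the element if present and does nothing otherwise,
# so it is safe even when no placeholder object exists.
source_names.discard("")
```

   With the empty string removed, only the real object names remain in the set,
   so nothing is left to be copied under the bare subdirectory name.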



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]