josh-fell commented on code in PR #29694:
URL: https://github.com/apache/airflow/pull/29694#discussion_r1123963291


##########
airflow/providers/google/suite/hooks/drive.py:
##########
@@ -243,8 +300,21 @@ def upload_file(
             .create(body=file_metadata, media_body=media, fields="id", 
supportsAllDrives=True)
             .execute(num_retries=self.num_retries)
         )
-        self.log.info("File %s uploaded to gdrive://%s.", local_location, 
remote_location)
-        return file.get("id")
+        file_id = file.get("id")
+
+        upload_location = remote_location
+
+        if folder_id != "root":
+            try:
+                upload_location = self._resolve_file_path(folder_id)
+            except (GoogleApiClientError, AirflowException) as e:

Review Comment:
   I agree with @aru-trackunit. I don't think the task should fail simply 
because the file path can't be resolved for logging purposes. It's unfortunate 
the Google API doesn't surface this information easily. As long as the task 
does what it's supposed to do, Airflow shouldn't care what is in its 
user-facing task logs. Having this logic configurable also helps too.
   
   I do agree with @eladkal though that `AirflowException` is irrelevant. The 
operator is not failing for some orchestration or execution-related issue (IMO 
I think we use `AirflowException` with too broad of a brush sometimes). Maybe a 
custom exception is raised (e.g. `NestedFolderLimitReachedException` or 
something similar) just in case some real AirflowException is swallowed 
unknowingly?



##########
airflow/providers/google/suite/hooks/drive.py:
##########
@@ -159,6 +161,59 @@ def exists(
             )
         )
 
+    def _get_file_info(self, file_id: str):
+        """
+        Returns Google API file_info object containing id, name, parents in 
the response
+        https://developers.google.com/drive/api/v3/reference/files/get
+
+        :param file_id: id as string representation of interested file
+        :return: file
+        """
+        file_info = (
+            self.get_conn()
+            .files()
+            .get(
+                fileId=file_id,
+                fields="id,name,parents",
+                supportsAllDrives=True,
+            )
+            .execute(num_retries=2)
+        )
+        return file_info
+
+    def _resolve_file_path(self, file_id: str) -> str:
+        """
+        Returns the full Google Drive path for given file_id
+
+        :param file_id: The id of a file in Google Drive
+        :return: Google Drive full path for a file
+        """
+        MAX_NESTED_FOLDERS_LEVEL = 20  # Link to docs 
https://support.google.com/a/users/answer/7338880?hl=en

Review Comment:
   This still feels magical. But, out of curiosity, if the documentation states 
"A folder in a shared drive can have up to 20 levels of nested folders", does 
that mean users _can't_ physically create more than that number of nested 
folders or is it really a _shouldn't_ create situation?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to