[jira] [Commented] (HADOOP-19474) ABFS: [FnsOverBlob] Listing Optimizations to avoid multiple iteration over list response.

ASF GitHub Bot (Jira) Wed, 26 Mar 2025 05:20:06 -0700


    [ https://issues.apache.org/jira/browse/HADOOP-19474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17938585#comment-17938585 ]


ASF GitHub Bot commented on HADOOP-19474:
-----------------------------------------

anujmodi2021 commented on code in PR #7421:
URL: https://github.com/apache/hadoop/pull/7421#discussion_r2014014171


##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsDfsClient.java:
##########
@@ -1464,20 +1472,41 @@ public Hashtable<String, String> getXMSProperties(AbfsHttpOperation result)
 
   /**
    * Parse the list file response from DFS ListPath API in Json format
-   * @param stream InputStream contains the list results.
-   * @throws IOException if parsing fails.
+   * @param result InputStream contains the list results.
+   * @param uri to be used for path conversion.
+   * @return {@link ListResponseData}. containing listing response.
+   * @throws AzureBlobFileSystemException if parsing fails.
    */
   @Override
-  public ListResultSchema parseListPathResults(final InputStream stream) throws IOException {
-    DfsListResultSchema listResultSchema;
-    try {
-      final ObjectMapper objectMapper = new ObjectMapper();
-      listResultSchema = objectMapper.readValue(stream, DfsListResultSchema.class);
+  public ListResponseData parseListPathResults(AbfsHttpOperation result, URI uri) throws AzureBlobFileSystemException {
+    try (InputStream listResultInputStream = result.getListResultStream()) {
+      DfsListResultSchema listResultSchema;
+      try {
+        final ObjectMapper objectMapper = new ObjectMapper();
+        listResultSchema = objectMapper.readValue(listResultInputStream,
+            DfsListResultSchema.class);
+        result.setListResultSchema(listResultSchema);
+        LOG.debug("ListPath listed {} paths with {} as continuation token",
+            listResultSchema.paths().size(),
+            getContinuationFromResponse(result));
+      } catch (IOException ex) {
+        throw new AbfsDriverException(ex);
+      }
+
+      List<FileStatus> fileStatuses = new ArrayList<>();
+      for (DfsListResultEntrySchema entry : listResultSchema.paths()) {
+        fileStatuses.add(getVersionedFileStatusFromEntry(entry, uri));
+      }
+      ListResponseData listResponseData = new ListResponseData();
+      listResponseData.setFileStatusList(fileStatuses);
+      listResponseData.setRenamePendingJsonPaths(null);
+      listResponseData.setContinuationToken(
+          getContinuationFromResponse(result));
+      return listResponseData;
     } catch (IOException ex) {
       LOG.error("Unable to deserialize list results", ex);

Review Comment:
   Taken





> ABFS: [FnsOverBlob] Listing Optimizations to avoid multiple iteration over 
> list response.
> -----------------------------------------------------------------------------------------
>
>                 Key: HADOOP-19474
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19474
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/azure
>    Affects Versions: 3.5.0, 3.4.1
>            Reporter: Anuj Modi
>            Assignee: Anuj Modi
>            Priority: Major
>              Labels: pull-request-available
>
> On the blob endpoint, a few pieces of handling need to be done on the 
> client side.
> This involves:
>  # Parsing the XML response and converting it to a VersionedFileStatus list
>  # Removing duplicate entries for non-empty explicit directories that arise 
> from the presence of marker files
>  # Triggering rename recovery for a previously failed rename, indicated by 
> the presence of a pending JSON file.
> Currently all three are done in separate iterations over the whole list. 
> This change brings them together in one place so that a single iteration 
> over the list response can handle all three.
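
For illustration only, the single-iteration idea described above might be sketched as follows. This is not the code from the PR: the class SinglePassListing, the Entry and Result types, and the "-RenamePending.json" suffix are all hypothetical stand-ins chosen for this sketch, and the real driver works with VersionedFileStatus objects and its own path handling.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class SinglePassListing {
  // Assumed suffix for a rename-pending marker file; illustrative only.
  static final String RENAME_PENDING_SUFFIX = "-RenamePending.json";

  /** Hypothetical stand-in for one raw entry of a parsed list response. */
  static final class Entry {
    final String path;
    final boolean isDirectory;

    Entry(String path, boolean isDirectory) {
      this.path = path;
      this.isDirectory = isDirectory;
    }
  }

  /** Hypothetical holder for everything collected during the single pass. */
  static final class Result {
    final List<Entry> entries = new ArrayList<>();
    final List<String> renamePendingJsonPaths = new ArrayList<>();
  }

  /** Handles conversion, de-duplication and rename detection in one loop. */
  static Result process(List<Entry> rawEntries) {
    Result result = new Result();
    Set<String> seenPaths = new HashSet<>();
    for (Entry entry : rawEntries) {
      // Step 2: a non-empty explicit directory can show up twice (once for
      // its marker, once as a prefix); keep only the first occurrence.
      if (!seenPaths.add(entry.path)) {
        continue;
      }
      // Step 3: remember pending-rename JSON files so that recovery can be
      // triggered afterwards without scanning the list again.
      if (!entry.isDirectory && entry.path.endsWith(RENAME_PENDING_SUFFIX)) {
        result.renamePendingJsonPaths.add(entry.path);
      }
      // Step 1: a real implementation would convert the entry to a file
      // status object here; this sketch keeps the entry as-is.
      result.entries.add(entry);
    }
    return result;
  }

  public static void main(String[] args) {
    List<Entry> raw = new ArrayList<>();
    raw.add(new Entry("/dir", true));
    raw.add(new Entry("/dir", true)); // duplicate caused by a marker file
    raw.add(new Entry("/a.txt", false));
    raw.add(new Entry("/old" + RENAME_PENDING_SUFFIX, false));
    Result result = process(raw);
    System.out.println(result.entries.size() + " entries, "
        + result.renamePendingJsonPaths.size() + " rename-pending file(s)");
  }
}
```

Whether a pending JSON file should remain in the returned listing is a choice made for this sketch; the actual driver may handle it differently.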



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

