[ 
https://issues.apache.org/jira/browse/HADOOP-18656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17832816#comment-17832816
 ] 

ASF GitHub Bot commented on HADOOP-18656:
-----------------------------------------

anujmodi2021 commented on code in PR #6409:
URL: https://github.com/apache/hadoop/pull/6409#discussion_r1546230961


##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClient.java:
##########
@@ -1117,12 +1117,22 @@ public AbfsRestOperation read(final String path,
     return op;
   }
 
-  public AbfsRestOperation deletePath(final String path, final boolean recursive, final String continuation,
+  public AbfsRestOperation deletePath(final String path, final boolean recursive,
+                                      final String continuation,
                                       TracingContext tracingContext)
           throws AzureBlobFileSystemException {
-    final List<AbfsHttpHeader> requestHeaders = createDefaultHeaders();
-
+    final List<AbfsHttpHeader> requestHeaders
+        = (isPaginatedDelete(tracingContext, recursive)
+        && xMsVersion.compareTo(ApiVersion.AUG_03_2023) < 0)
+        ? createDefaultHeaders(ApiVersion.AUG_03_2023)
+        : createDefaultHeaders(xMsVersion);
     final AbfsUriQueryBuilder abfsUriQueryBuilder = createDefaultUriQueryBuilder();
+
+    if (isPaginatedDelete(tracingContext, recursive)) {

Review Comment:
   If this is a paginated delete, the API version override condition fails 
only when the current version is AUG_03_2023 or later. 
   
   In that case we can go ahead with the current API version, since Azure APIs 
are backward compatible.
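
   The version choice discussed above can be sketched as follows. This is a 
minimal illustration, not the PR's code: the enum here is a simplified stand-in 
for the driver's ApiVersion (its members and date strings are assumptions, 
declared in chronological order so that compareTo orders them by release date), 
and DeleteVersionSelector and effectiveVersion are hypothetical names.

```java
// Simplified stand-in for the driver's ApiVersion enum (assumption).
// Constants are declared oldest first, so Enum.compareTo follows release order.
enum ApiVersion {
  DEC_12_2019("2019-12-12"),
  APR_10_2021("2021-04-10"),
  AUG_03_2023("2023-08-03");

  private final String value;

  ApiVersion(String value) {
    this.value = value;
  }

  @Override
  public String toString() {
    return value;
  }
}

// Hypothetical helper: picks the x-ms-version to send with a delete request.
final class DeleteVersionSelector {
  private DeleteVersionSelector() {
  }

  static ApiVersion effectiveVersion(ApiVersion configured, boolean paginatedDelete) {
    // Upgrade only when pagination is requested and the configured version
    // predates pagination support; otherwise keep the configured version.
    if (paginatedDelete && configured.compareTo(ApiVersion.AUG_03_2023) < 0) {
      return ApiVersion.AUG_03_2023;
    }
    return configured;
  }

  public static void main(String[] args) {
    System.out.println(effectiveVersion(ApiVersion.DEC_12_2019, true));
    System.out.println(effectiveVersion(ApiVersion.DEC_12_2019, false));
  }
}
```

   A configured version at or above AUG_03_2023 is passed through unchanged, 
which relies on the backward compatibility noted in the comment.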





> ABFS: Support for Pagination in Recursive Directory Delete 
> -----------------------------------------------------------
>
>                 Key: HADOOP-18656
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18656
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/azure
>    Affects Versions: 3.3.5
>            Reporter: Sree Bhattacharyya
>            Assignee: Anuj Modi
>            Priority: Minor
>              Labels: pull-request-available
>
> Today, when a recursive delete is issued for a large directory in an ADLS 
> Gen2 (HNS) account, the directory deletion itself happens in O(1), but the 
> backend performs ACL checks recursively for each object inside that 
> directory, which for a large directory can lead to a request timeout. 
> Pagination has been introduced in the Azure Storage backend for these ACL 
> checks.
> More information on how pagination works can be found in the public 
> documentation of the [Azure Delete Path 
> API|https://learn.microsoft.com/en-us/rest/api/storageservices/datalakestoragegen2/path/delete?view=rest-storageservices-datalakestoragegen2-2019-12-12].
> This PR contains changes to support this on the client side. To trigger 
> pagination, the client needs to add a new query parameter "paginated" and set 
> it to true, along with recursive set to true. In return, if the directory is 
> large, the server might return a continuation token to the caller. If the 
> caller gets back a continuation token, it has to call the delete API again 
> with that continuation token, keeping recursive and paginated set to true. 
> This is similar to directory delete on an FNS account.
> Pagination is available only in API versions "2023-08-03" onwards.
> The PR also contains functional tests to verify that the driver works well 
> with different combinations of the recursive and pagination features for both 
> HNS and FNS accounts.
> Full E2E testing of pagination requires a large dataset to be created and is 
> hence not added as part of the driver test suite, but extensive E2E testing 
> has been performed.
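
The continuation-token protocol described above can be sketched as a simple 
client loop. This is an illustrative sketch under stated assumptions, not the 
driver's implementation: PaginatedDeleteCall and PaginatedDeleteLoop are 
hypothetical names, and the single-method interface stands in for whatever 
issues the real Delete Path request and extracts the continuation token from 
the response.

```java
// Hypothetical abstraction over one Delete Path request (assumption).
// Returns the continuation token from the response, or null when the
// server reports that the delete is complete.
interface PaginatedDeleteCall {
  String delete(String path, boolean recursive, boolean paginated, String continuation);
}

// Hypothetical helper that repeats the delete call until no token comes back.
final class PaginatedDeleteLoop {
  private PaginatedDeleteLoop() {
  }

  /** Returns the number of requests that were issued. */
  static int deleteRecursively(PaginatedDeleteCall call, String path) {
    String continuation = null; // the first request carries no token
    int requests = 0;
    do {
      // Every request keeps recursive=true and paginated=true; only the
      // continuation token changes between calls.
      continuation = call.delete(path, true, true, continuation);
      requests++;
    } while (continuation != null && !continuation.isEmpty());
    return requests;
  }
}
```

For a small directory the server may finish in one request and return no 
token, in which case the loop body runs exactly once.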



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

