[ https://issues.apache.org/jira/browse/HADOOP-19233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17922691#comment-17922691 ]
ASF GitHub Bot commented on HADOOP-19233:
-----------------------------------------

anujmodi2021 commented on code in PR #7265:
URL: https://github.com/apache/hadoop/pull/7265#discussion_r1937087835


##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/AzureBlobFileSystemStore.java:
##########
@@ -1155,9 +1169,10 @@ public void delete(final Path path, final boolean recursive,
     do {
       try (AbfsPerfInfo perfInfo = startTracking("delete", "deletePath")) {
         AbfsRestOperation op = getClient().deletePath(relativePath, recursive,
-            continuation, tracingContext, getIsNamespaceEnabled(tracingContext));
+            continuation, tracingContext);
         perfInfo.registerResult(op.getResult());
-        continuation = op.getResult().getResponseHeader(HttpHeaderConfigurations.X_MS_CONTINUATION);
+        continuation = op.getResult()

Review Comment:
   Avoid diffs

##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/AzureBlobFileSystemStore.java:
##########
@@ -1144,9 +1158,9 @@ public void delete(final Path path, final boolean recursive,
     boolean shouldContinue = true;

     LOG.debug("delete filesystem: {} path: {} recursive: {}",
-        getClient().getFileSystem(),
-        path,
-        String.valueOf(recursive));
+        getClient().getFileSystem(),

Review Comment:
   Avoid diffs

##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/AzureBlobFileSystemStore.java:
##########
@@ -673,7 +694,7 @@ public OutputStream createFile(final Path path,
     }

     final ContextEncryptionAdapter contextEncryptionAdapter;
-    if (createClient.getEncryptionType() == EncryptionType.ENCRYPTION_CONTEXT) {
+    if (getClient().getEncryptionType() == EncryptionType.ENCRYPTION_CONTEXT) {

Review Comment:
   Why are we changing this? This seems wrong.
   Right @anmolanmol1234

##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/AzureBlobFileSystemStore.java:
##########
@@ -1114,15 +1129,14 @@ public boolean rename(final Path source,
     do {
       try (AbfsPerfInfo perfInfo = startTracking("rename", "renamePath")) {
-        boolean isNamespaceEnabled = getIsNamespaceEnabled(tracingContext);
         final AbfsClientRenameResult abfsClientRenameResult =
             getClient().renamePath(sourceRelativePath, destinationRelativePath,
-                continuation, tracingContext, sourceEtag, false,
-                isNamespaceEnabled);
+                continuation, tracingContext, sourceEtag, false);
         AbfsRestOperation op = abfsClientRenameResult.getOp();
         perfInfo.registerResult(op.getResult());
-        continuation = op.getResult().getResponseHeader(HttpHeaderConfigurations.X_MS_CONTINUATION);
+        continuation = op.getResult()

Review Comment:
   Avoid diffs

##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsBlobClient.java:
##########
@@ -1694,16 +1976,24 @@ private boolean isNonEmptyListing(String path,
    * @return True if empty results without continuation token.
    */
   private boolean isEmptyListResults(AbfsHttpOperation result) {

Review Comment:
   Why diffs in this method? Please avoid unnecessary diffs. I agree that they don't break anything, but the PR is already big; it should not touch code it is not supposed to. For general improvements please raise another Jira and PR.

##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/AzureBlobFileSystem.java:
##########
@@ -18,6 +18,7 @@
 package org.apache.hadoop.fs.azurebfs;

+import javax.annotation.Nullable;

Review Comment:
   Import ordering looks wrong, causing unnecessary diff as well.
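For background on the continuation handling these hunks touch: the delete and rename loops keep re-issuing the request until the service stops returning an `x-ms-continuation` response header. A minimal, self-contained sketch of that loop shape follows; the class and method names are illustrative stand-ins (a canned token sequence replaces the real `AbfsClient` round-trip), not the PR's actual code.

```java
import java.util.Iterator;
import java.util.List;

public class ContinuationLoopSketch {

    // Stand-in for one deletePath/renamePath round-trip: returns the next
    // continuation token from a canned sequence, or null when exhausted
    // (i.e. the service no longer sends an x-ms-continuation header).
    static String nextToken(Iterator<String> canned) {
        return canned.hasNext() ? canned.next() : null;
    }

    // Mirrors the do/while shape in the hunks above: keep issuing requests
    // until no continuation token comes back. Returns the number of calls.
    static int countRoundTrips(List<String> tokenSequence) {
        Iterator<String> canned = tokenSequence.iterator();
        String continuation;
        int calls = 0;
        do {
            continuation = nextToken(canned); // one REST call
            calls++;
        } while (continuation != null);
        return calls;
    }

    public static void main(String[] args) {
        // Two continuation tokens -> three round-trips in total.
        System.out.println(countRoundTrips(List.of("token-1", "token-2")));
    }
}
```

A design consequence worth noting: because each iteration's token comes from the previous response, the loop is inherently sequential per directory; parallelism has to come from processing the returned entries, not from the paging itself.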
##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClient.java:
##########
@@ -42,25 +42,42 @@
 import java.util.concurrent.TimeUnit;
 import java.util.concurrent.atomic.AtomicBoolean;

+import org.slf4j.Logger;

Review Comment:
   Thanks for fixing import ordering in this class.


> ABFS: [FnsOverBlob] Implementing Rename and Delete APIs over Blob Endpoint
> --------------------------------------------------------------------------
>
>                 Key: HADOOP-19233
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19233
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/azure
>    Affects Versions: 3.4.0
>            Reporter: Anuj Modi
>            Assignee: Manish Bhatt
>            Priority: Major
>              Labels: pull-request-available
>
> Currently, we only support rename and delete operations on the DFS endpoint.
> The reason for supporting rename and delete operations on the Blob endpoint
> is that the Blob endpoint does not account for hierarchy. We need to ensure
> that the HDFS contracts are maintained when performing rename and delete
> operations. Renaming or deleting a directory over the Blob endpoint requires
> the client to handle the orchestration and rename or delete all the blobs
> within the specified directory.
>
> The task outlines the considerations for implementing rename and delete
> operations for the FNS-blob endpoint to ensure compatibility with HDFS
> contracts:
> * {*}Blob Endpoint Usage{*}: The task addresses the need for abstraction in
> the code to maintain HDFS contracts while performing rename and delete
> operations on the blob endpoint, which does not support hierarchy.
> * {*}Rename Operations{*}: The {{AzureBlobFileSystem#rename()}} method will
> use a {{RenameHandler}} instance to handle rename operations, with separate
> handlers for the DFS and blob endpoints. This method includes prechecks,
> destination adjustments, and orchestration of directory renaming for blobs.
> * {*}Atomic Rename{*}: Atomic renaming is essential for blob endpoints, as
> it requires orchestration to copy or delete each blob within the directory. A
> configuration will allow developers to specify directories for atomic
> renaming, with a JSON file to track the status of renames.
> * {*}Delete Operations{*}: Delete operations are simpler than renames,
> requiring fewer HDFS contract checks. For blob endpoints, the client must
> handle orchestration, including managing orphaned directories created by
> Az-copy.
> * {*}Orchestration for Rename/Delete{*}: Orchestration for rename and delete
> operations over blob endpoints involves listing blobs and performing actions
> on each blob. The process must be optimized to handle large numbers of blobs
> efficiently.
> * {*}Need for Optimization{*}: Optimization is crucial because the
> {{ListBlob}} API can return a maximum of 5000 blobs at once, necessitating
> multiple calls for large directories. The task proposes a producer-consumer
> model to handle blobs in parallel, thereby reducing processing time and
> memory usage.
> * {*}Producer-Consumer Design{*}: The proposed design includes a producer to
> list blobs, a queue to store the blobs, and a consumer to process them in
> parallel. This approach aims to improve efficiency and mitigate memory issues.
> More details will follow.
>
> Prerequisites for this patch:
> 1. HADOOP-19187 ABFS: [FnsOverBlob] Making AbfsClient Abstract for supporting
> both DFS and Blob Endpoint - ASF JIRA (apache.org)
> 2. HADOOP-19226 ABFS: [FnsOverBlob] Implementing Azure Rest APIs on Blob
> Endpoint for AbfsBlobClient - ASF JIRA (apache.org)
> 3. HADOOP-19207 ABFS: [FnsOverBlob] Response Handling of Blob Endpoint APIs
> and Metadata APIs - ASF JIRA (apache.org)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
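The producer-consumer design described in the quoted issue (a producer paging through ListBlob results, a queue, and consumers acting on each blob) can be sketched as below. This is a minimal single-threaded simulation under stated assumptions: the class, method names, and the fake `listPage` service call are all hypothetical, and the real design would process the queue on parallel worker threads; only the 5000-entry page cap comes from the issue text.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class BlobBatchSketch {

    // Server-side cap on one ListBlob response, per the issue description.
    static final int PAGE_SIZE = 5000;

    // Fake "producer" call: simulates one ListBlob page, returning up to
    // PAGE_SIZE blob names for a directory containing totalBlobs entries.
    static List<String> listPage(int totalBlobs, int offset) {
        List<String> page = new ArrayList<>();
        for (int i = offset; i < Math.min(offset + PAGE_SIZE, totalBlobs); i++) {
            page.add("dir/blob-" + i);
        }
        return page;
    }

    // Pages through the listing, feeding each page into a queue and draining
    // it immediately ("consumer"), so memory stays bounded by one page rather
    // than the whole directory. Returns how many blobs were processed.
    static int processDirectory(int totalBlobs) {
        Queue<String> queue = new ArrayDeque<>();
        int processed = 0;
        int offset = 0;
        while (true) {
            List<String> page = listPage(totalBlobs, offset); // producer step
            if (page.isEmpty()) {
                break; // no more pages: listing exhausted
            }
            queue.addAll(page);
            offset += page.size();
            while (!queue.isEmpty()) { // consumer step: delete/rename each blob
                queue.poll();
                processed++;
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        // 12,000 blobs -> three ListBlob pages (5000 + 5000 + 2000).
        System.out.println(processDirectory(12_000));
    }
}
```

In the multithreaded version the issue proposes, `ArrayDeque` would become a bounded blocking queue so the producer stalls when consumers fall behind, which is what keeps memory usage flat for very large directories.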