[
https://issues.apache.org/jira/browse/HADOOP-18012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17703196#comment-17703196
]
ASF GitHub Bot commented on HADOOP-18012:
-----------------------------------------
sreeb-msft commented on code in PR #5488:
URL: https://github.com/apache/hadoop/pull/5488#discussion_r1143357077
##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/AzureBlobFileSystem.java:
##########
@@ -441,11 +441,19 @@ public boolean rename(final Path src, final Path dst) throws IOException {
         return dstFileStatus.isDirectory() ? false : true;
       }
 
+      boolean isNamespaceEnabled = abfsStore.getIsNamespaceEnabled(tracingContext);
+
       // Non-HNS account need to check dst status on driver side.
-      if (!abfsStore.getIsNamespaceEnabled(tracingContext) && dstFileStatus == null) {
+      if (!isNamespaceEnabled && dstFileStatus == null) {
         dstFileStatus = tryGetFileStatus(qualifiedDstPath, tracingContext);
       }
+      // for Non-HNS accounts, rename resiliency cannot be maintained
+      // as eTags are not preserved in rename
Review Comment:
@steveloughran here is the change that checks for FNS accounts and switches
the rename resilience flag accordingly.
> ABFS: Enable config controlled ETag check for Rename idempotency
> ----------------------------------------------------------------
>
> Key: HADOOP-18012
> URL: https://issues.apache.org/jira/browse/HADOOP-18012
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/azure
> Affects Versions: 3.3.2
> Reporter: Sneha Vijayarajan
> Assignee: Sree Bhattacharyya
> Priority: Major
> Labels: pull-request-available
>
> The ABFS driver handles rename idempotency by relying on the last-modified
> time (LMT) of the destination file to decide whether a rename succeeded when
> the source file is absent and the rename request had entered the retry loop.
> This handling is incorrect because the LMT of the destination does not change
> on rename.
> This Jira tracks the change to remove the current implementation and add a
> new one in which, for an incoming rename operation, the source file's eTag is
> fetched first and the rename is done only if the eTag matches for the source
> file.
> Because the extra HEAD request added to each rename makes this a costly
> operation, the implementation will be guarded by a config and can be enabled
> by customers whose workloads do multiple renames.
> A long-term plan to handle rename idempotency without a HEAD request is being
> discussed.
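
The eTag-based recovery described above, together with the non-HNS restriction noted in the review diff, can be sketched roughly as follows. This is a minimal in-memory illustration and not the ABFS driver code: `RenameRecoverySketch`, `Store`, `renameWithRecovery` and the `firstResponseLost` switch are all hypothetical names invented for the example. It also assumes that recovery means comparing the destination's eTag with the source eTag recorded before the rename, which is only one possible reading of the description.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; every name here is hypothetical, not an ABFS API.
class RenameRecoverySketch {

  /** In-memory stand-in for the remote store: path -> eTag. */
  static class Store {
    private final Map<String, String> eTags = new HashMap<>();
    private final boolean namespaceEnabled;
    private int counter = 0;

    Store(boolean namespaceEnabled) {
      this.namespaceEnabled = namespaceEnabled;
    }

    boolean isNamespaceEnabled() {
      return namespaceEnabled;
    }

    void put(String path, String eTag) {
      eTags.put(path, eTag);
    }

    /** Simulated HEAD request: returns the eTag, or null if the path is absent. */
    String head(String path) {
      return eTags.get(path);
    }

    /** Simulated rename: returns false when the source does not exist. */
    boolean rename(String src, String dst) {
      String eTag = eTags.remove(src);
      if (eTag == null) {
        return false;
      }
      // Assumption taken from the diff comment: only HNS accounts keep the eTag.
      eTags.put(dst, namespaceEnabled ? eTag : "new-" + (++counter));
      return true;
    }
  }

  /**
   * Renames src to dst. When firstResponseLost is true, the first attempt is
   * applied but its result is discarded and the request is retried, which
   * mimics a rename that entered the retry loop.
   */
  static boolean renameWithRecovery(Store store, String src, String dst,
      boolean firstResponseLost) {
    // The extra HEAD is only useful when eTags survive a rename (HNS accounts).
    String srcETag = store.isNamespaceEnabled() ? store.head(src) : null;

    boolean succeeded = store.rename(src, dst);
    if (firstResponseLost) {
      succeeded = store.rename(src, dst); // retry now sees a missing source
    }
    if (succeeded) {
      return true;
    }
    // Source is missing: treat as success only if dst carries the recorded eTag.
    return srcETag != null && srcETag.equals(store.head(dst));
  }

  public static void main(String[] args) {
    Store hns = new Store(true);
    hns.put("/a", "e1");
    System.out.println(renameWithRecovery(hns, "/a", "/b", true));

    Store fns = new Store(false);
    fns.put("/a", "e1");
    System.out.println(renameWithRecovery(fns, "/a", "/b", true));
  }
}
```

In this sketch the HNS case recovers from the lost response because the destination still carries the recorded eTag, while the non-HNS case reports failure even though the file was moved, which is why the recovery path is skipped there.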