[ 
https://issues.apache.org/jira/browse/HADOOP-18012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-18012:
------------------------------------
    Description: 
To support recovery from a communications failure during rename, the abfs 
client fetches the etag of the source file and, when recovering from a 
failure, uses this etag to determine whether the rename succeeded *before the 
failure happened*.

# This works for files, but not directories.
# It adds the overhead of a HEAD request before each rename.
# The option can be disabled by setting "fs.azure.enable.rename.resilience" to 
false.

Note: the manifest committer collects etags during task commit and supplies 
them to the abfs client for the rename, which avoids the need for a HEAD call. 
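
The recovery logic described above can be sketched in Java against a toy 
in-memory store rather than the real abfs client. FakeStore, 
renameWithRecovery, and the resilience flag are hypothetical names used only 
for illustration, and the sketch assumes that the destination carries the 
source's etag after a rename, which is what makes the comparison possible.

```java
import java.util.HashMap;
import java.util.Map;

public class RenameRecoverySketch {

    /** Toy in-memory store (path -> etag); a hypothetical stand-in for the remote service. */
    static final class FakeStore {
        final Map<String, String> etags = new HashMap<>();
        boolean dropNextResponse = false; // simulate a response lost after the rename was applied
        int headCalls = 0;                // counts simulated HEAD requests

        /** Simulated HEAD: returns the etag, or null if the path does not exist. */
        String head(String path) {
            headCalls++;
            return etags.get(path);
        }

        /** Simulated rename; assumption: the etag moves with the file. */
        void rename(String src, String dst) {
            String tag = etags.remove(src);
            if (tag == null) {
                throw new IllegalStateException("source not found: " + src);
            }
            etags.put(dst, tag);
            if (dropNextResponse) {
                dropNextResponse = false;
                throw new RuntimeException("simulated comms failure");
            }
        }
    }

    /**
     * Renames src to dst. Returns true if the call failed but the etag check
     * showed the rename had already succeeded, false on a plain success.
     * knownEtag may be supplied by the caller (as a committer might) to skip
     * the up-front HEAD; pass null to have it fetched here.
     */
    static boolean renameWithRecovery(FakeStore store, String src, String dst,
                                      String knownEtag, boolean resilience) {
        String sourceEtag = knownEtag;
        if (sourceEtag == null && resilience) {
            sourceEtag = store.head(src); // the extra HEAD before the rename
        }
        try {
            store.rename(src, dst);
            return false;
        } catch (RuntimeException e) {
            if (!resilience || sourceEtag == null) {
                throw e; // nothing to compare against: surface the failure
            }
            boolean sourceGone = store.head(src) == null;
            if (sourceGone && sourceEtag.equals(store.head(dst))) {
                return true; // rename took effect before the failure
            }
            throw e;
        }
    }

    public static void main(String[] args) {
        FakeStore store = new FakeStore();
        store.etags.put("/job/task-0001", "etag-1");
        store.dropNextResponse = true;
        boolean recovered =
            renameWithRecovery(store, "/job/task-0001", "/out/part-0001", null, true);
        System.out.println("recovered after failure: " + recovered);
    }
}
```

With the resilience flag off, the same simulated failure propagates to the 
caller; with a caller-supplied etag, no HEAD is issued unless recovery is 
actually needed.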

  was:
The ABFS driver handles rename idempotency by relying on the last-modified 
time (LMT) of the destination file to decide whether the rename succeeded when 
the source file is absent and the rename request had entered the retry loop.

This handling is incorrect because the LMT of the destination does not change 
on rename. 

This Jira tracks the change to remove the current implementation and add a new 
one in which, for an incoming rename operation, the source file's eTag is 
fetched first and the rename is done only if the eTag matches the source file.

As this is going to be a costly operation, given that an extra HEAD request is 
added to each rename, this implementation will be guarded by a config option 
and can be enabled by customers who have workloads that do multiple renames. 

A long-term plan to handle rename idempotency without a HEAD request is being 
discussed.


> ABFS: Enable config controlled ETag check for Rename idempotency
> ----------------------------------------------------------------
>
>                 Key: HADOOP-18012
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18012
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/azure
>    Affects Versions: 3.3.2
>            Reporter: Sneha Vijayarajan
>            Assignee: Sree Bhattacharyya
>            Priority: Major
>              Labels: pull-request-available
>
> To support recovery from a communications failure during rename, the abfs 
> client fetches the etag of the source file and, when recovering from a 
> failure, uses this etag to determine whether the rename succeeded *before 
> the failure happened*.
> # This works for files, but not directories
> # this adds the overhead of a HEAD request before each rename.
> # the option can be disabled by setting "fs.azure.enable.rename.resilience" 
> to false
> Note: the manifest committer collects etags during task commit and 
> supplies them to the abfs client for the rename, which avoids the need for a 
> HEAD call. 


