[ 
https://issues.apache.org/jira/browse/HADOOP-17215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Marquardt updated HADOOP-17215:
--------------------------------------
    Fix Version/s: 3.4.0
     Release Note: ABFS: Support for conditional overwrite.
       Resolution: Fixed
           Status: Resolved  (was: Patch Available)

commit e31a636e922a8fdbe0aa7cca53f6de7175e97254
Author: Sneha Vijayarajan <[email protected]>
Date: Wed Aug 26 00:31:35 2020 +0530

HADOOP-17215: Support for conditional overwrite.

Contributed by Sneha Vijayarajan

 

DETAILS:

This change adds config key "fs.azure.enable.conditional.create.overwrite" with
a default of true. When enabled, if create(path, overwrite: true) is invoked
and the file exists, the ABFS driver will first obtain its etag and then
attempt to overwrite the file on the condition that the etag matches. The
purpose of this is to mitigate the non-idempotency of this method.
Specifically, in the event of a network error or similar, the client will
retry and this can result in the file being created more than once which may
result in data loss. In essence this is like a poor man's file handle, and
will be addressed more thoroughly in the future when support for lease is
added to ABFS.
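
The etag check described above can be illustrated with a minimal sketch. This
is not the ABFS driver code: EtagStoreSketch, EtagMismatchException, and every
method name below are hypothetical, and the store is an in-memory stand-in for
the service. It only shows why a retried overwrite that carries a stale etag
is rejected instead of replacing the file a second time.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a store that tracks an etag per file.
// None of these names come from the ABFS driver.
class EtagStoreSketch {
    // Thrown when the caller's etag no longer matches the stored one.
    static class EtagMismatchException extends RuntimeException {
        EtagMismatchException(String message) {
            super(message);
        }
    }

    private final Map<String, String> contents = new HashMap<>();
    private final Map<String, String> etags = new HashMap<>();
    private long version = 0;

    // Unconditional write: always replaces the file and issues a new etag.
    String write(String path, String content) {
        contents.put(path, content);
        String etag = "etag-" + (++version);
        etags.put(path, etag);
        return etag;
    }

    // Returns the current etag, or null if the file does not exist.
    String getEtag(String path) {
        return etags.get(path);
    }

    // Returns the current content, or null if the file does not exist.
    String read(String path) {
        return contents.get(path);
    }

    // Conditional overwrite: succeeds only if the file is unchanged since
    // the caller obtained expectedEtag.
    String overwriteIfMatch(String path, String content, String expectedEtag) {
        String current = etags.get(path);
        if (current == null || !current.equals(expectedEtag)) {
            throw new EtagMismatchException("etag changed for " + path);
        }
        return write(path, content);
    }
}
```

In this sketch, a client that reads the etag once and then sends the same
conditional overwrite twice (as a retry after a timeout would) sees the first
request succeed and the second fail with EtagMismatchException, because the
first request already changed the etag.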

> ABFS: Support for conditional overwrite
> ---------------------------------------
>
>                 Key: HADOOP-17215
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17215
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/azure
>    Affects Versions: 3.3.0
>            Reporter: Sneha Vijayarajan
>            Assignee: Sneha Vijayarajan
>            Priority: Major
>              Labels: abfsactive
>             Fix For: 3.4.0
>
>
> Filesystem create APIs that do not accept an overwrite argument default the
> flag to true.
> As a result, we are observing a large number of create requests with
> overwrite=true, primarily because the flag defaults to true in the Create
> API being called. When a create with overwrite times out, we have observed
> that it can lead to a race condition between the first create and the
> retried one, which run almost in parallel.
> To avoid this scenario for create requests with overwrite=true, the ABFS
> driver will always first attempt to create without overwrite. If the create
> fails because the file already exists (fileAlreadyPresent), it will resend
> the request with overwrite=true.
>  
>  
>  
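
The two-step create flow in the description can be sketched as follows. This
is an illustration under stated assumptions, not the driver implementation:
CreateFallbackSketch, Store, FileExistsException, and
createPreferNoOverwrite are hypothetical names, and the store is an in-memory
map that counts how many overwrite requests it receives.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of "create without overwrite first, then fall back".
class CreateFallbackSketch {
    // Stand-in for the "file already present" failure.
    static class FileExistsException extends RuntimeException {
        FileExistsException(String path) {
            super("file already exists: " + path);
        }
    }

    // Hypothetical in-memory store that counts overwrite requests.
    static class Store {
        final Map<String, String> files = new HashMap<>();
        int overwriteRequests = 0;

        void create(String path, String content, boolean overwrite) {
            if (overwrite) {
                overwriteRequests++;
            } else if (files.containsKey(path)) {
                throw new FileExistsException(path);
            }
            files.put(path, content);
        }
    }

    // Always try the create without overwrite first; resend with
    // overwrite=true only when the file already exists.
    static void createPreferNoOverwrite(Store store, String path, String content) {
        try {
            store.create(path, content, false);
        } catch (FileExistsException e) {
            store.create(path, content, true);
        }
    }
}
```

With this flow, creating a new path never issues an overwrite request in the
sketch; only a create against an existing path does.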



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
