[ https://issues.apache.org/jira/browse/HADOOP-17215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17946658#comment-17946658 ]
Manish Bhatt commented on HADOOP-17215:
---------------------------------------

[~mthakur] This change was implemented to mitigate the non-idempotency of the create operation. Specifically, if a network error or similar failure occurs, the client may retry the request, potentially resulting in the file being created more than once, which could lead to data loss.

Could you please share the specific scenario that is causing this issue? Why are there so many parallel create requests?

If you intend to honor every request regardless of whether the file already exists, you can disable the conditional create flag.

> ABFS: Support for conditional overwrite
> ---------------------------------------
>
>                 Key: HADOOP-17215
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17215
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/azure
>    Affects Versions: 3.3.0
>            Reporter: Sneha Vijayarajan
>            Assignee: Sneha Vijayarajan
>            Priority: Major
>              Labels: abfsactive
>             Fix For: 3.3.1, 3.4.0
>
>
> Filesystem Create APIs that do not accept an argument for the overwrite flag
> end up defaulting it to true.
> We are observing that the request count of creates with overwrite=true is
> higher, primarily because the called Create API defaults the flag to true.
> When a create with overwrite ends up timing out, we have observed that it
> can lead to race conditions between the first create and the retried one
> running almost in parallel.
> To avoid this scenario for a create with overwrite=true request, the ABFS
> driver will always attempt to create without overwrite first. If the create
> fails due to fileAlreadyPresent, it will resend the request with
> overwrite=true.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
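
For readers following the thread, the create flow described in the issue can be sketched roughly as below. This is a minimal illustration, not the ABFS driver code: `InMemoryStore`, `createWithOverwrite`, and the `conditionalCreate` parameter are hypothetical names, and the store is a plain in-memory map standing in for the remote service. The shipped driver may perform additional checks on the retry path (for example, eTag preconditions) that are omitted here. The flag mentioned in the comment is, to my knowledge, exposed as `fs.azure.enable.conditional.create.overwrite`; please verify that against your Hadoop version.

```java
import java.util.HashMap;
import java.util.Map;

class ConditionalCreateSketch {

    /** Hypothetical stand-in for the remote store; not the ABFS client API. */
    static class InMemoryStore {
        final Map<String, String> files = new HashMap<>();
        int overwriteRequests = 0; // counts requests sent with overwrite=true

        /** Returns false when the path already exists and overwrite is false. */
        boolean create(String path, String content, boolean overwrite) {
            if (overwrite) {
                overwriteRequests++;
                files.put(path, content);
                return true;
            }
            // putIfAbsent returns null only when the path was not present.
            return files.putIfAbsent(path, content) == null;
        }
    }

    /**
     * Caller asked for overwrite=true. With conditional create enabled, try
     * without overwrite first and resend with overwrite=true only if the
     * file is already present. With it disabled, overwrite unconditionally.
     */
    static void createWithOverwrite(InMemoryStore store, String path,
                                    String content, boolean conditionalCreate) {
        if (!conditionalCreate) {
            store.create(path, content, true);
            return;
        }
        boolean created = store.create(path, content, false);
        if (!created) {
            // Simulated "fileAlreadyPresent" failure: fall back to overwrite.
            store.create(path, content, true);
        }
    }

    public static void main(String[] args) {
        InMemoryStore store = new InMemoryStore();
        createWithOverwrite(store, "/data/part-0000", "first", true);
        createWithOverwrite(store, "/data/part-0000", "second", true);
        System.out.println("overwrite=true requests: " + store.overwriteRequests);
    }
}
```

In this sketch, the first create of a new path never sends overwrite=true, so only creates that hit an existing file reach the overwrite path. Passing `conditionalCreate = false` models disabling the flag, in which case every request overwrites regardless of prior existence.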