[
https://issues.apache.org/jira/browse/HADOOP-16090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16761739#comment-16761739
]
Steve Loughran commented on HADOOP-16090:
-----------------------------------------
I've been thinking for a while about having an operation context follow those
long-lived ops: one which would get created when the write or read is kicked
off, and which follows it to the end. A number of reasons:
# I've promised the flink team a low cost "don't check for existence or delete
parent dirs" option for higher performance IO: essentially a "we know what we
are doing just give us the PUT" operation, accessed by the new createFile()
builder.
# It lets us pass down an htrace (or successor) context which would then be
added to the UA header of each operation.
There's already an {{S3AReadOpContext}} for reading; a mirror for writing with
some base class for common state (context ID) would go all the way through the
work.
If the existence of the file and its parent dir were added to the write
context, the later stages of the operation would know not to worry.
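Roughly what I have in mind, as a sketch only: a base class for the common state, with a write-side mirror of the read context carrying the "we know what we are doing" flags. All class, field and method names below are illustrative, not the actual S3A classes.

```java
// Illustrative sketch: a base context shared by the read and write paths,
// mirroring the existing S3AReadOpContext. Names are hypothetical.
abstract class S3AOpContext {
  private final String contextId;    // correlates all requests of one long-lived op
  private final String traceContext; // e.g. an htrace (or successor) span ID

  S3AOpContext(String contextId, String traceContext) {
    this.contextId = contextId;
    this.traceContext = traceContext;
  }

  /** Value to append to the User-Agent header of each S3 request. */
  String userAgentSuffix() {
    return "op/" + contextId + " trace/" + traceContext;
  }
}

// Write-side mirror: carries the flags the createFile() builder would set so
// the write path can skip existence checks and parent-dir cleanup.
class S3AWriteOpContext extends S3AOpContext {
  final boolean skipExistenceCheck;   // caller asserts overwrite is safe
  final boolean skipParentDirCleanup; // caller asserts no fake parent dirs exist

  S3AWriteOpContext(String contextId, String traceContext,
      boolean skipExistenceCheck, boolean skipParentDirCleanup) {
    super(contextId, traceContext);
    this.skipExistenceCheck = skipExistenceCheck;
    this.skipParentDirCleanup = skipParentDirCleanup;
  }
}
```

The flink-style "just give us the PUT" path would then construct the context once with both flags set and hand it down through every helper.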
BTW, I've just been warned this really hurts distcp uploads. We do need to fix
this, though moving any of this stuff back to branch-2 is pretty unlikely,
especially once you get into passing a context around.
An initial patch from you issuing a HEAD for all parent entries is not the
lowest-cost option, but it is the only one I could really imagine backporting.
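A sketch of that backportable variant: generate the ancestor keys as today, but HEAD each one and put only the keys that actually exist into the batch delete, so no delete markers get created for keys that were never there. The {{ObjectStore}} interface and all names here are stand-ins, not the real S3AFileSystem internals.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the S3 client: exists() would be a HEAD request per key,
// deleteObjects() a single batch DeleteObjects call.
interface ObjectStore {
  boolean exists(String key);
  void deleteObjects(List<String> keys);
}

class FakeDirCleaner {
  /** Build the fake-directory keys for every ancestor of an object key. */
  static List<String> ancestorDirKeys(String key) {
    List<String> dirs = new ArrayList<>();
    int slash = key.lastIndexOf('/');
    while (slash > 0) {
      dirs.add(key.substring(0, slash) + "/"); // fake dir keys end with '/'
      slash = key.lastIndexOf('/', slash - 1);
    }
    return dirs;
  }

  /** Delete only the fake directory markers which actually exist. */
  static void deleteExistingFakeDirs(ObjectStore store, String key) {
    List<String> toDelete = new ArrayList<>();
    for (String dir : ancestorDirKeys(key)) {
      if (store.exists(dir)) { // the extra HEAD per ancestor: the cost
        toDelete.add(dir);
      }
    }
    if (!toDelete.isEmpty()) {
      store.deleteObjects(toDelete);
    }
  }
}
```

The extra HEADs are why this isn't the lowest-cost option, but it keeps the change local to the cleanup path, with no context to thread through.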
> deleteUnnecessaryFakeDirectories() creates unnecessary delete markers in a
> versioned S3 bucket
> ----------------------------------------------------------------------------------------------
>
> Key: HADOOP-16090
> URL: https://issues.apache.org/jira/browse/HADOOP-16090
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 2.8.1
> Reporter: Dmitri Chmelev
> Priority: Minor
>
> The fix to avoid calls to getFileStatus() for each path component in
> deleteUnnecessaryFakeDirectories() (HADOOP-13164) results in accumulation of
> delete markers in versioned S3 buckets. The above patch replaced
> getFileStatus() checks with a single batch delete request formed by
> generating all ancestor keys formed from a given path. Since the delete
> request is not checking for existence of fake directories, it will create a
> delete marker for every path component that did not exist (or was previously
> deleted). Note that issuing a DELETE request without specifying a version ID
> will always create a new delete marker, even if one already exists ([AWS S3
> Developer
> Guide|https://docs.aws.amazon.com/AmazonS3/latest/dev/RemDelMarker.html]).
> Since deleteUnnecessaryFakeDirectories() is called as a callback on
> successful writes and on renames, delete markers accumulate rather quickly
> and their rate of accumulation is inversely proportional to the depth of the
> path. In other words, directories closer to the root will have more delete
> markers than the leaves.
> This behavior negatively impacts the performance of the getFileStatus()
> operation when it has to issue a listObjects() request (especially v1), as
> the delete markers have to be examined while the request searches for the
> first current, non-deleted version of an object following a given prefix.
> I did a quick comparison against 3.x and the issue is still present:
> https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java#L2947
>
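To make the accumulation described above concrete, here is a toy model (purely illustrative, not S3A or SDK code) of the documented behavior that a DELETE without a version ID always adds a new delete marker, even for a nonexistent key:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of a versioned bucket. Per the AWS docs linked above, a DELETE
// without a version ID always adds one more delete marker, regardless of
// whether the key exists or already carries a marker.
class VersionedBucket {
  private final Map<String, Integer> deleteMarkers = new HashMap<>();

  void deleteWithoutVersionId(String key) {
    deleteMarkers.merge(key, 1, Integer::sum); // unconditionally one more marker
  }

  int markerCount(String key) {
    return deleteMarkers.getOrDefault(key, 0);
  }
}
```

After N successful writes under a deep path, the root-most ancestor key has been batch-deleted N times and so carries N markers, which matches the report that accumulation is inversely proportional to path depth.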
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]