[ https://issues.apache.org/jira/browse/HADOOP-16090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitri Chmelev updated HADOOP-16090:
------------------------------------
    Description: 
The fix to avoid calls to getFileStatus() for each path component in 
deleteUnnecessaryFakeDirectories() (HADOOP-13164) results in an accumulation of 
delete markers in versioned S3 buckets. That patch replaced the per-component 
getFileStatus() checks with a single batch delete request built from all 
ancestor keys of the given path. Since the delete request does not check 
whether the fake directories exist, it creates a delete marker for every path 
component that does not exist (or was previously deleted). Note that issuing a 
DELETE request without specifying a version ID always creates a new delete 
marker, even if one already exists ([AWS S3 Developer 
Guide|https://docs.aws.amazon.com/AmazonS3/latest/dev/RemDelMarker.html]).
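
To make the mechanism concrete, here is a minimal sketch (hypothetical bucket 
and key names, AWS SDK for Java v1; this is not the actual S3AFileSystem code) 
of a batch delete over all ancestor "fake directory" keys of a path. Because no 
version IDs are supplied, a versioned bucket records a new delete marker for 
every key in the batch, whether or not the key ever existed:

{code:java}
// Simplified sketch, not the actual S3AFileSystem code: delete all ancestor
// "fake directory" keys of an object in one batch, as HADOOP-13164 does.
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.DeleteObjectsRequest;

import java.util.ArrayList;
import java.util.List;

public class FakeDirDeleteSketch {
  public static void main(String[] args) {
    AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
    String bucket = "example-versioned-bucket";   // hypothetical bucket name

    // For the object key "a/b/c/file.txt" the ancestor fake-directory keys are
    // "a/b/c/", "a/b/" and "a/". They are all added to one batch delete
    // without first checking whether any of them exist.
    String key = "a/b/c/file.txt";
    List<DeleteObjectsRequest.KeyVersion> keys = new ArrayList<>();
    String parent = key.substring(0, key.lastIndexOf('/'));
    while (!parent.isEmpty()) {
      keys.add(new DeleteObjectsRequest.KeyVersion(parent + "/"));
      int slash = parent.lastIndexOf('/');
      parent = slash > 0 ? parent.substring(0, slash) : "";
    }

    // No version IDs are specified, so on a versioned bucket every key in the
    // batch gets a brand-new delete marker, even if the key never existed or
    // already carries a delete marker.
    s3.deleteObjects(new DeleteObjectsRequest(bucket).withKeys(keys));
  }
}
{code}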

Since deleteUnnecessaryFakeDirectories() is called as a callback on successful 
writes and on renames, delete markers accumulate quickly, and their rate of 
accumulation is inversely proportional to the depth of the path: every write 
under a deep path touches all of its shallower ancestors, so directories closer 
to the root end up with more delete markers than the leaves.

This behavior degrades the performance of getFileStatus() when it has to issue 
a listObjects() request (especially v1), as the delete markers must be examined 
while the request searches for the first current, non-deleted version of an 
object following the given prefix.
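
For reference, one way to see how many delete markers have piled up under a 
prefix is to walk the version listing and count them; the sketch below 
(hypothetical bucket and prefix, AWS SDK for Java v1, diagnostic only and not 
part of S3AFileSystem) does just that:

{code:java}
// Diagnostic sketch: count the delete markers accumulated under a prefix in a
// versioned bucket. Bucket and prefix names are hypothetical.
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.ListVersionsRequest;
import com.amazonaws.services.s3.model.S3VersionSummary;
import com.amazonaws.services.s3.model.VersionListing;

public class CountDeleteMarkers {
  public static void main(String[] args) {
    AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
    ListVersionsRequest req = new ListVersionsRequest()
        .withBucketName("example-versioned-bucket")   // hypothetical bucket
        .withPrefix("a/");                            // hypothetical prefix
    long markers = 0;
    VersionListing listing;
    do {
      listing = s3.listVersions(req);
      for (S3VersionSummary v : listing.getVersionSummaries()) {
        if (v.isDeleteMarker()) {
          markers++;
        }
      }
      // Page through the (potentially large) version listing.
      req.setKeyMarker(listing.getNextKeyMarker());
      req.setVersionIdMarker(listing.getNextVersionIdMarker());
    } while (listing.isTruncated());
    System.out.println("Delete markers under prefix: " + markers);
  }
}
{code}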

I did a quick comparison against 3.x and the issue is still present: 
[https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java#L2947|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java#L2947]

 

> deleteUnnecessaryFakeDirectories() creates unnecessary delete markers in a 
> versioned S3 bucket
> ----------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-16090
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16090
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3
>    Affects Versions: 2.8.1
>            Reporter: Dmitri Chmelev
>            Priority: Minor
>


