[
https://issues.apache.org/jira/browse/HADOOP-13760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15993965#comment-15993965
]
Aaron Fabbri commented on HADOOP-13760:
---------------------------------------
Sorry for the long comment here but I want to think through this issue:
There is a lot going on here WRT rename(). One of the trickier parts seems to
be the need to iterate over the S3 view of objects while taking into account
the additions and deletions that MetadataStore knows about.
rename() has to drive two interfaces which have different views of the
filesystem:
- (A) MetadataStore: uses the FileSystem view of objects
- (B) S3 view: the FileSystem view, minus non-empty "inferred" directories.
Since S3 only has objects for empty directories, the view is a subset of A.
Given an enumeration of a subtree for B, we know how to map to A. We can
create a new enumeration which simply takes the set from B and adds
intermediate directories (non-leaf === non-empty). e.g. if we see an object at
/a/b/c/d/file when moving subtree rooted at /a/b, we know we also need to add
/a/b, /a/b/c, and /a/b/c/d to the S3-object view (B) to get the filesystem view
(A). [~liuml07]'s recently added code does this.
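To make the B-to-A mapping concrete, here is a minimal sketch of the idea. It is
not the code [~liuml07] added; the class and method names are hypothetical, and
paths are plain strings rather than Hadoop Path objects.

```java
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

/** Hypothetical helper, not part of S3A: expands the S3 object view (B) into the filesystem view (A). */
final class InferredDirectories {
  private InferredDirectories() {}

  /**
   * @param root absolute path of the subtree being moved, e.g. "/a/b"
   * @param objectPaths absolute paths of the S3 objects found under root
   * @return objectPaths plus every inferred directory from root down to each object's parent
   */
  static SortedSet<String> toFileSystemView(String root, List<String> objectPaths) {
    SortedSet<String> view = new TreeSet<>();
    for (String path : objectPaths) {
      view.add(path);
      // Walk up from the object's parent and stop once we leave the subtree.
      String parent = parentOf(path);
      while (parent != null && isAtOrUnder(root, parent)) {
        view.add(parent); // the set ignores directories we have already added
        parent = parentOf(parent);
      }
    }
    return view;
  }

  /** Returns the parent path, or null when the path is directly under "/". */
  static String parentOf(String path) {
    int slash = path.lastIndexOf('/');
    return slash <= 0 ? null : path.substring(0, slash);
  }

  static boolean isAtOrUnder(String root, String path) {
    return path.equals(root) || path.startsWith(root + "/");
  }
}
```

Calling toFileSystemView("/a/b", ...) with the single object /a/b/c/d/file
yields /a/b, /a/b/c, /a/b/c/d and the file itself, matching the example above.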
Maybe we can reduce the problem to coming up with a MetadataStore-aware version
of (B):
- (B') The set of S3 objects (rooted at some path P), union with all objects in
MetadataStore under P, minus any marked deleted in MetadataStore.
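As a rough sketch of (B'), treating it purely as set algebra: the map below is a
hypothetical stand-in for MetadataStore (path to "is marked deleted"), not its
real interface, and the sketch ignores the file/directory distinction.

```java
import java.util.Map;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

/** Hypothetical sketch of (B'): S3 listing, plus store entries, minus store tombstones. */
final class MergedListing {
  private MergedListing() {}

  /**
   * @param root the path P that the listing is rooted at
   * @param s3Objects paths returned by the (possibly stale) S3 listing
   * @param storeEntries stand-in for MetadataStore: path mapped to true when marked deleted
   */
  static SortedSet<String> merge(String root, Set<String> s3Objects,
                                 Map<String, Boolean> storeEntries) {
    SortedSet<String> merged = new TreeSet<>();
    for (String path : s3Objects) {
      if (isUnder(root, path)) {
        merged.add(path);
      }
    }
    for (Map.Entry<String, Boolean> entry : storeEntries.entrySet()) {
      if (!isUnder(root, entry.getKey())) {
        continue;
      }
      if (entry.getValue()) {
        merged.remove(entry.getKey()); // tombstone hides a stale S3 entry
      } else {
        merged.add(entry.getKey());    // store knows about an object S3 has not listed yet
      }
    }
    return merged;
  }

  static boolean isUnder(String root, String path) {
    return path.startsWith(root + "/");
  }
}
```

The store wins in both directions here: a tombstone removes an object that S3
still lists, and a live entry adds an object that S3 does not list yet.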
When we start to think about the invariant that S3 only contains directory
objects for empty directories, things get complicated. If we assume that the
invariant is already maintained (dir blobs were already deleted when they
became non-empty, or created when they became empty) the problem is simplified.
This seems to imply things like {{createFakeDirectoryIfNecessary()}} and
{{deleteUnnecessaryFakeDirectories()}} may need to write their changes to
MetadataStore so eventual S3 list consistency doesn't give us an older view of
the S3 objects which violates the empty directory blob invariant.
MetadataStore, however, doesn't record the S3 view where non-empty directories
must not be present. There are two cases we care about, happening before our
rename():
1. An s3 empty directory blob was deleted (became non-empty). The
MetadataStore will know the directory is non-empty since it recorded the
creation of the child entry.
2. An s3 empty directory blob was created (became empty). This may not be
recorded in the MetadataStore. (This is the Tristate.UNKNOWN case: unless the
FS client has reported the dir empty with isAuthoritative=true, *and* the
MetadataStore implementation supports authoritative listings, it will answer
UNKNOWN to a query about whether the directory is empty, because there may be
files in the S3 bucket it does not know about yet.)
#2 would manifest as an empty directory disappearing after a recursive
rename(): Since the empty blob hit eventual consistency in S3, the rename
iteration did not see the empty directory blob and thus does not copy it to the
destination (nor delete it from the source). This bug/feature exists today
without S3Guard so we may just leave it as a caveat for now. The solution may
be to add authoritative listing support to DynamoDBMetadataStore.
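The emptiness answer behind cases 1 and 2 could be sketched roughly as follows.
This is an illustration under my own assumptions, not the S3Guard code: the
enum is a local copy of the three Tristate values, and the method and parameter
names are hypothetical.

```java
/** Hypothetical sketch of how a MetadataStore-backed emptiness query might answer. */
final class EmptyDirectoryCheck {
  private EmptyDirectoryCheck() {}

  /** Local stand-in mirroring the three Tristate values discussed above. */
  enum Tristate { TRUE, FALSE, UNKNOWN }

  /**
   * @param knownLiveChildren children the store has recorded and not marked deleted
   * @param listingIsAuthoritative the FS client reported the listing with isAuthoritative=true
   * @param storeSupportsAuthoritative the MetadataStore implementation keeps authoritative listings
   */
  static Tristate isEmptyDirectory(int knownLiveChildren,
                                   boolean listingIsAuthoritative,
                                   boolean storeSupportsAuthoritative) {
    if (knownLiveChildren > 0) {
      // Case 1: the store recorded a child entry, so the directory is non-empty.
      return Tristate.FALSE;
    }
    if (listingIsAuthoritative && storeSupportsAuthoritative) {
      // Only a fully authoritative listing can rule out unseen objects in S3.
      return Tristate.TRUE;
    }
    // Case 2: S3 may hold files the store does not know about yet.
    return Tristate.UNKNOWN;
  }
}
```

With no authoritative listing support, a directory with no known children
always comes back UNKNOWN, which is why the store alone cannot confirm case 2.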
> S3Guard: add delete tracking
> ----------------------------
>
> Key: HADOOP-13760
> URL: https://issues.apache.org/jira/browse/HADOOP-13760
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Reporter: Aaron Fabbri
> Assignee: Sean Mackrory
> Attachments: HADOOP-13760-HADOOP-13345.001.patch,
> HADOOP-13760-HADOOP-13345.002.patch, HADOOP-13760-HADOOP-13345.003.patch,
> HADOOP-13760-HADOOP-13345.004.patch
>
>
> Following the S3AFileSystem integration patch in HADOOP-13651, we need to add
> delete tracking.
> Current behavior on delete is to remove the metadata from the MetadataStore.
> To make deletes consistent, we need to add an {{isDeleted}} flag to
> {{PathMetadata}} and check it when returning results from functions like
> {{getFileStatus()}} and {{listStatus()}}. In HADOOP-13651, I added TODO
> comments in most of the places these new conditions are needed. The work
> does not look too bad.