[
https://issues.apache.org/jira/browse/HADOOP-13760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15979405#comment-15979405
]
Aaron Fabbri commented on HADOOP-13760:
---------------------------------------
Ok.. looking at v2 patch now. Just to set expectations, this is a complex
little feature and I expect it to take some time to get right. Some initial
questions here:
1. It looks like you make significant changes to rename.
- This makes me nervous (rename should make anybody nervous, it is healthy).
I'd suggest we break it out to separate "pre-refactor" patch if we still need
it. In general the more we can split out pre-refactoring the better. It can
be done later when the patch is closer to complete.
- You change from iterating over S3's ObjectListing (which is subset of actual
directory subtree vertices), to iterating over actual directory tree via
listFilesAndDirectories(). I think the problem here is that you will be (A)
deleting directory blobs that don't exist on S3, and (B) creating directory
blobs that *should not* exist on S3. I'd expect this rename code to cause
destination subtrees to disappear, as those directory keys you write would be
interpreted as empty dirs by S3A later on.
*If* I'm correct on the last point, I hope there is a integration or unit test
that fails with this code. If not we should create one.
2. In innerMkdirs(), i think the try/catch block around your
checkPathForDirectory() is no longer needed. You use return values, not FNF
exception.
3. Switch statements should probably just be if / else in these cases. Any
savings you had were given up having to add a fallthrough findbugs exclusion
IMO.
I'm continuing to look at the patch but wanted to get you some early feedback.
> S3Guard: add delete tracking
> ----------------------------
>
> Key: HADOOP-13760
> URL: https://issues.apache.org/jira/browse/HADOOP-13760
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Reporter: Aaron Fabbri
> Assignee: Sean Mackrory
> Attachments: HADOOP-13760-HADOOP-13345.001.patch,
> HADOOP-13760-HADOOP-13345.002.patch
>
>
> Following the S3AFileSystem integration patch in HADOOP-13651, we need to add
> delete tracking.
> Current behavior on delete is to remove the metadata from the MetadataStore.
> To make deletes consistent, we need to add a {{isDeleted}} flag to
> {{PathMetadata}} and check it when returning results from functions like
> {{getFileStatus()}} and {{listStatus()}}. In HADOOP-13651, I added TODO
> comments in most of the places these new conditions are needed. The work
> does not look too bad.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]