[ 
https://issues.apache.org/jira/browse/HADOOP-13760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15979405#comment-15979405
 ] 

Aaron Fabbri commented on HADOOP-13760:
---------------------------------------

Ok.. looking at v2 patch now.  Just to set expectations, this is a complex 
little feature and I expect it to take some time to get right.  Some initial 
questions here:

1. It looks like you make significant changes to rename.

- This makes me nervous (rename should make anybody nervous, it is healthy).  
I'd suggest we break it out to separate "pre-refactor" patch if we still need 
it.  In general the more we can split out pre-refactoring the better.  It can 
be done later when the patch is closer to complete.

- You change from iterating over S3's ObjectListing (which is subset of actual 
directory subtree vertices), to iterating over actual directory tree via 
listFilesAndDirectories().  I think the problem here is that you will be (A) 
deleting directory blobs that don't exist on S3, and (B) creating directory 
blobs that *should not* exist on S3.  I'd expect this rename code to cause 
destination subtrees to disappear, as those directory keys you write would be 
interpreted as empty dirs by S3A later on.

*If* I'm correct on the last point, I hope there is a integration or unit test 
that fails with this code.  If not we should create one.  

2. In innerMkdirs(), i think the try/catch block around your 
checkPathForDirectory() is no longer needed.  You use return values, not FNF 
exception.

3. Switch statements should probably just be if / else in these cases.  Any 
savings you had were given up having to add a fallthrough findbugs exclusion 
IMO.

I'm continuing to look at the patch but wanted to get you some early feedback.


> S3Guard: add delete tracking
> ----------------------------
>
>                 Key: HADOOP-13760
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13760
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Aaron Fabbri
>            Assignee: Sean Mackrory
>         Attachments: HADOOP-13760-HADOOP-13345.001.patch, 
> HADOOP-13760-HADOOP-13345.002.patch
>
>
> Following the S3AFileSystem integration patch in HADOOP-13651, we need to add 
> delete tracking.
> Current behavior on delete is to remove the metadata from the MetadataStore.  
> To make deletes consistent, we need to add a {{isDeleted}} flag to 
> {{PathMetadata}} and check it when returning results from functions like 
> {{getFileStatus()}} and {{listStatus()}}.  In HADOOP-13651, I added TODO 
> comments in most of the places these new conditions are needed.  The work 
> does not look too bad.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

Reply via email to