[
https://issues.apache.org/jira/browse/HADOOP-16380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16889127#comment-16889127
]
Steve Loughran commented on HADOOP-16380:
-----------------------------------------
No problem; at least we haven't shipped it.
I still worry that we have an issue in the following case, for a directory where
S3Guard has
* a tombstone with a name N
* no other children
* the directory not tagged as authoritative
S3 has
* A file with name P > N
* something under the tombstone path N
The DDB will say "Unknown", kicking off the scan (this isn't a regression, BTW).
That scan will then not see file P, because the entry under N is found first
and then dismissed. We need to consider extra diligence in this special case
(make that a separate JIRA, though).
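
The case above can be reproduced in a toy model. This is only a sketch, not S3A code: S3 is modelled as a plain list of keys, the tombstone set stands in for the S3Guard/DDB entries, and the names `list_limited`, `is_empty_naive`, `S3_KEYS` and `TOMBSTONES` are hypothetical. It assumes the LIST is limited to one entry, as in the issue description.

```python
def list_limited(s3_keys, prefix, max_keys):
    """Toy S3 LIST with '/' delimiter: up to max_keys direct children, in key order."""
    children = []
    for key in sorted(s3_keys):
        if not key.startswith(prefix) or key == prefix:
            continue
        # Collapse anything deeper than one level into its common prefix.
        child = prefix + key[len(prefix):].split("/", 1)[0]
        if child not in children:
            children.append(child)
        if len(children) == max_keys:
            break
    return children


def is_empty_naive(s3_keys, tombstones, prefix, max_keys=1):
    """Flawed check: filter tombstones out of a truncated listing."""
    listed = list_limited(s3_keys, prefix, max_keys)
    live = [child for child in listed if child not in tombstones]
    return not live


# Hypothetical state: tombstone for N, something under N in S3, and a file P > N.
S3_KEYS = ["dir/N/child", "dir/P"]
TOMBSTONES = {"dir/N"}

print(is_empty_naive(S3_KEYS, TOMBSTONES, "dir/"))  # True, although dir/P is live
```

The entry under N sorts first, fills the one-entry listing, and is then dismissed by the tombstone, so P is never looked at.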
> S3A tombstones can confuse empty directory status
> -------------------------------------------------
>
> Key: HADOOP-16380
> URL: https://issues.apache.org/jira/browse/HADOOP-16380
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3, test
> Affects Versions: 3.2.0, 3.0.3, 3.3.0, 3.1.2
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Blocker
>
> If S3AFileSystem does an S3 LIST restricted to a single object to see if a
> directory is empty, and the single entry found has a tombstone marker (either
> from an inconsistent DDB Table or from an eventually consistent LIST), then it
> will consider the directory empty, _even if there are one or more entries
> which are not deleted_.
> We need to make sure the calculation of whether a directory is empty or not
> is resilient to this, efficiently.
> It surfaces as an issue in two places:
> * delete(path) (where it may make things worse)
> * rename(src, dest), where a check is made for dest != an empty directory.
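
One way to make the emptiness calculation resilient, sketched in the same toy spirit: keep reading the listing past tombstoned entries and stop at the first live one. This is an illustration under assumptions, not the fix adopted for this issue; `iter_children` and `is_empty_resilient` are hypothetical names, and S3 is again modelled as a list of keys.

```python
def iter_children(s3_keys, prefix):
    """Yield the direct children of prefix in key order (toy LIST with '/' delimiter)."""
    seen = set()
    for key in sorted(s3_keys):
        if not key.startswith(prefix) or key == prefix:
            continue
        child = prefix + key[len(prefix):].split("/", 1)[0]
        if child not in seen:
            seen.add(child)
            yield child


def is_empty_resilient(s3_keys, tombstones, prefix):
    """Empty only if no listed child survives tombstone filtering.

    any() stops at the first live child, so the extra listing work is
    bounded by the number of tombstoned entries that sort before it.
    """
    return not any(
        child not in tombstones for child in iter_children(s3_keys, prefix)
    )


keys = ["dir/N/child", "dir/P"]
print(is_empty_resilient(keys, {"dir/N"}, "dir/"))  # False: dir/P is found
```

The cost of this approach grows with the number of tombstones that precede the first live entry, which is the efficiency concern raised in the description.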