[ 
https://issues.apache.org/jira/browse/HADOOP-12169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704179#comment-14704179
 ] 

Steve Loughran commented on HADOOP-12169:
-----------------------------------------

OK
# code patch looks good, though I think we could pull the 
{{f.makeQualified(uri, workingDir)}} out of the loop for a bit of a 
performance/memory use improvement
# the new test is something we should have in every FS test. How about we add 
an {{AbstractContractLsTest}} with the s3/s3n impls being the first ones, 
alongside file:// and hdfs://; others can be added later
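
A rough sketch of what pulling the qualification out of the loop could look like. It uses java.net.URI as a stand-in for Hadoop's Path, and the {{qualify}} and {{listChildren}} helpers are hypothetical names for illustration, not the code in the patch:

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for the listing loop; NOT the actual S3AFileSystem code.
class HoistedQualification {

  // Hypothetical helper, loosely mirroring what Path.makeQualified does:
  // resolve a possibly relative path against the filesystem URI and working dir.
  static URI qualify(URI fsUri, String workingDir, String path) {
    URI parsed = URI.create(path);
    if (parsed.isAbsolute()) {
      return parsed; // already carries a scheme, nothing to do
    }
    String absolute = path.startsWith("/") ? path : workingDir + "/" + path;
    return fsUri.resolve(absolute);
  }

  // keys are assumed to be already qualified, as keyPath is in the issue.
  static List<URI> listChildren(URI fsUri, String workingDir, String dir,
                                List<URI> keys) {
    // Qualify the requested directory once, before the loop.
    URI qualifiedDir = qualify(fsUri, workingDir, dir);
    List<URI> result = new ArrayList<>();
    for (URI keyPath : keys) {
      if (keyPath.equals(qualifiedDir)) {
        continue; // the directory's own entry is not one of its children
      }
      result.add(keyPath);
    }
    return result;
  }
}
```

With a key list that contains only the directory's own entry, {{listChildren}} returns an empty list, which is the behaviour the issue asks for.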

> ListStatus on empty dir in S3A lists itself instead of returning an empty list
> ------------------------------------------------------------------------------
>
>                 Key: HADOOP-12169
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12169
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.6.0, 2.7.0, 2.7.1
>            Reporter: Pieter Reuse
>            Assignee: Pieter Reuse
>         Attachments: HADOOP-12169-001.patch
>
>
> Upon testing the patch for HADOOP-11918, I stumbled upon a weird behaviour 
> it introduces in the S3AFileSystem class. Calling listStatus() on an empty 
> bucket returns an empty list, while doing the same on an empty directory 
> returns an array of length 1 containing only the directory itself.
> The bugfix is quite simple. In the line of code {code}...if 
> (keyPath.equals(f)...{code} (S3AFileSystem:758), keyPath is qualified wrt. 
> the fs and f is not. Therefore, the comparison returns false when it should 
> return true. The bugfix is to make f qualified in this line of code.
> More formally: according to [The Hadoop FileSystem 
> API 
> Definition|https://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-common/filesystem/],
>  more specifically FileSystem.listStatus, only the child elements of a 
> directory should be returned by a listStatus() call.
> In detail: 
> {code}
> elif isDir(FS, p): result = [getFileStatus(c) for c in children(FS, p) where f(c) == True]
> {code}
> and
> {code}
> def children(FS, p) = {q for q in paths(FS) where parent(q) == p}
> {code}
> This means that listStatus on an empty directory must return an empty 
> list. This matches the behaviour of ls in Unix, which is what one would 
> expect from a FileSystem.
> Note: it seemed appropriate to add the test of this patch to the same file as 
> the test for HADOOP-11918, but as a result, one of the two will have to be 
> rebased wrt. the other before being applied to trunk.
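
The {{children}} and {{listStatus}} definitions quoted above can be modelled in a few lines. This is a toy sketch only: it assumes paths are plain absolute strings without trailing slashes, the filter f accepts everything, and the {{ListingModel}} class is hypothetical, not Hadoop code:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Toy model of the specification's children() definition; NOT Hadoop code.
class ListingModel {
  // paths(FS): every path known to the model filesystem, kept sorted.
  private final Set<String> paths = new TreeSet<>();

  ListingModel(String... allPaths) {
    paths.addAll(Arrays.asList(allPaths));
  }

  // parent(q): the root has no parent; "/a" has parent "/".
  static String parent(String q) {
    if (q.equals("/")) {
      return null;
    }
    int slash = q.lastIndexOf('/');
    return slash == 0 ? "/" : q.substring(0, slash);
  }

  // children(FS, p) = {q for q in paths(FS) where parent(q) == p}
  List<String> children(String p) {
    List<String> result = new ArrayList<>();
    for (String q : paths) {
      if (p.equals(parent(q))) {
        result.add(q);
      }
    }
    return result;
  }
}
```

Because a directory is never its own parent, it can never appear in its own listing, so an empty directory yields an empty list under this model.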



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
