[ 
https://issues.apache.org/jira/browse/HADOOP-9984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546242#comment-14546242
 ] 

Jason Lowe commented on HADOOP-9984:
------------------------------------

bq. In the case of globStatus, things are even worse if you choose to resolve 
symlinks, since then you can glob for '*foo' and get back 'bar'. A lot of 
software breaks if globs return back file names that the glob doesn't match.

As I understand it, globStatus is simply listStatus with filtering applied to 
the results.  If that's the case then globStatus should do whatever listStatus 
does with respect to symlinks, and that would be to resolve the symlink 
_except_ for the path in the resulting FileStatus.  This goes back to the 
readdir() + stat() analogy -- everything in the resulting FileStatus needs to 
be about where the symlink points _except_ the path.  The path would still be 
the path to the link, since that's what readdir() would see as well.  Every 
other field in FileStatus corresponds to what stat() would return, so those 
fields should reflect what the symlink references.  So globStatus 
should not lead to surprises where "foo*" returns "bar" even in the presence of 
symlinks.
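
To make the readdir() + stat() analogy concrete, here is a minimal Python sketch against the local POSIX filesystem.  It is not Hadoop code: the Status class and the glob_status function are hypothetical stand-ins for FileStatus and globStatus, and the sketch assumes every link has a live target (a dangling link would make os.stat raise).

```python
import fnmatch
import os
import stat
from dataclasses import dataclass


@dataclass
class Status:
    """Hypothetical stand-in for FileStatus (not the Hadoop class)."""
    path: str     # path of the directory entry itself (the link, if it is one)
    is_dir: bool  # taken from stat() on whatever the entry resolves to
    length: int   # taken from stat() on whatever the entry resolves to


def glob_status(directory, pattern):
    """List a directory, filter entry names by glob, then stat() each match."""
    results = []
    # os.listdir plays the role of readdir(): it sees link names, not targets.
    for name in sorted(os.listdir(directory)):
        # The glob is matched against the entry name only.
        if not fnmatch.fnmatchcase(name, pattern):
            continue
        full = os.path.join(directory, name)
        # os.stat follows symlinks, so these fields describe the target.
        st = os.stat(full)
        results.append(Status(full, stat.S_ISDIR(st.st_mode), st.st_size))
    return results
```

With a file "bar" and a symlink "foo_link" pointing at it, glob_status(dir, "foo*") would yield one entry whose path ends in "foo_link" but whose length is that of "bar", and glob_status(dir, "*bar") would yield only "bar".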

> FileSystem#globStatus and FileSystem#listStatus should resolve symlinks by 
> default
> ----------------------------------------------------------------------------------
>
>                 Key: HADOOP-9984
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9984
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs
>    Affects Versions: 2.1.0-beta
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>            Priority: Critical
>              Labels: BB2015-05-TBR
>         Attachments: HADOOP-9984.001.patch, HADOOP-9984.003.patch, 
> HADOOP-9984.005.patch, HADOOP-9984.007.patch, HADOOP-9984.009.patch, 
> HADOOP-9984.010.patch, HADOOP-9984.011.patch, HADOOP-9984.012.patch, 
> HADOOP-9984.013.patch, HADOOP-9984.014.patch, HADOOP-9984.015.patch
>
>
> During the process of adding symlink support to FileSystem, we realized that 
> many existing HDFS clients would be broken by listStatus and globStatus 
> returning symlinks.  One example is applications that assume that 
> !FileStatus#isFile implies that the inode is a directory.  As we discussed in 
> HADOOP-9972 and HADOOP-9912, we should default these APIs to returning 
> resolved paths.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
