[
https://issues.apache.org/jira/browse/HADOOP-9912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13753620#comment-13753620
]
Daryn Sharp commented on HADOOP-9912:
-------------------------------------
bq. The intended behavior of Globber.glob (which calls listStatus) is to return
symlink rather than symlink target I believe
bq. I guess for a long time, pig is using this behavior(listStatus return
symlink target rather than symlink), I am afraid this behavior is wrong and is
inconsistent with HDFS.
Wrong. Wrong. Wrong. {{listStatus}} resolves symlinks. {{globStatus}} is
supposed to be equivalent to {{listStatus}} with wildcard support. All
existing code depends on these semantics, and rightly so. Symlinks should be
transparent to users unless they specifically want to know if a path is a
symlink. That's why there is a counterpart to {{getFileStatus}} called
{{getFileLinkStatus}} which does not resolve symlinks.
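The resolve versus no-resolve split described above can be seen with the JDK's own file API. This is only an analogy, assuming a local filesystem that supports symlinks: it uses {{java.nio.file}}, not the Hadoop {{FileSystem}} API, and the class and method names ({{LinkStatusDemo}}, {{probe}}) are made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

// Analogy only: java.nio.file stands in for the Hadoop FileSystem API.
public class LinkStatusDemo {
    /**
     * Creates a temporary directory and a symlink pointing at it, then reads
     * the link's attributes with and without following it.
     * Returns {followedIsDirectory, unfollowedIsSymlink, unfollowedIsDirectory}.
     */
    public static boolean[] probe() {
        try {
            Path root = Files.createTempDirectory("linkdemo");
            Path target = Files.createDirectory(root.resolve("target"));
            Path link = Files.createSymbolicLink(root.resolve("symlink"), target);

            // Default behaviour follows the link, in the spirit of getFileStatus.
            BasicFileAttributes followed =
                Files.readAttributes(link, BasicFileAttributes.class);
            // NOFOLLOW_LINKS describes the link itself, in the spirit of getFileLinkStatus.
            BasicFileAttributes unfollowed =
                Files.readAttributes(link, BasicFileAttributes.class,
                                     LinkOption.NOFOLLOW_LINKS);

            boolean[] result = {
                followed.isDirectory(),
                unfollowed.isSymbolicLink(),
                unfollowed.isDirectory()
            };

            // Clean up; deleting the link does not touch its target.
            Files.delete(link);
            Files.delete(target);
            Files.delete(root);
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = probe();
        System.out.println("followed.isDirectory      = " + r[0]);
        System.out.println("unfollowed.isSymbolicLink = " + r[1]);
        System.out.println("unfollowed.isDirectory    = " + r[2]);
    }
}
```

A caller that only asks "is this a directory?" gets a useful answer from the link-following call, which is the transparency argued for here.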
HADOOP-9877 fundamentally broke the semantics of {{globStatus}}, which now
behaves differently depending on whether the last path component is a glob or
static. The result is:
* /path/symlink - the static component "symlink" results in a file status of
the symlink, breaking isFile/isDir/etc
* /path/sym*link - the glob component "sym*link" returns the file status of the
resolved link, working as expected
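The two cases above can be reproduced with a toy in-memory model. This is a sketch of the behavior as described in this comment, not the actual {{Globber}} code: {{ToyGlob}}, its methods, and the {{alwaysResolve}} flag are all hypothetical, and only {{*}} within a single path component is supported.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Pattern;

// Toy model only; none of these names exist in Hadoop.
public class ToyGlob {
    private final Set<String> dirs = new HashSet<>();
    private final Map<String, String> links = new HashMap<>();
    private final List<String> paths = new ArrayList<>();

    public void addDir(String p) { dirs.add(p); paths.add(p); }
    public void addLink(String p, String target) { links.put(p, target); paths.add(p); }

    /** Stand-in for getFileStatus: follows symlinks to the final target. */
    public String status(String p) {
        while (links.containsKey(p)) p = links.get(p);
        return dirs.contains(p) ? "DIR" : "FILE";
    }

    /** Stand-in for getFileLinkStatus: reports the link itself. */
    public String linkStatus(String p) {
        return links.containsKey(p) ? "SYMLINK" : status(p);
    }

    /**
     * alwaysResolve=false models the reported behavior: a static last
     * component yields the link's own status, a glob yields the resolved one.
     * alwaysResolve=true models the consistent behavior asked for here.
     */
    public List<String> glob(String pattern, boolean alwaysResolve) {
        String last = pattern.substring(pattern.lastIndexOf('/') + 1);
        boolean lastIsGlob = last.contains("*");
        Pattern regex = toRegex(pattern);
        List<String> out = new ArrayList<>();
        for (String p : paths) {
            if (regex.matcher(p).matches()) {
                out.add(alwaysResolve || lastIsGlob ? status(p) : linkStatus(p));
            }
        }
        return out;
    }

    /** Translates '*' into "any characters except '/'"; the rest is literal. */
    private static Pattern toRegex(String glob) {
        StringBuilder sb = new StringBuilder();
        String[] pieces = glob.split("\\*", -1);
        for (int i = 0; i < pieces.length; i++) {
            if (i > 0) sb.append("[^/]*");
            sb.append(Pattern.quote(pieces[i]));
        }
        return Pattern.compile(sb.toString());
    }

    /** Builds /path, /path/target (a directory) and /path/symlink -> /path/target. */
    public static ToyGlob sample() {
        ToyGlob fs = new ToyGlob();
        fs.addDir("/path");
        fs.addDir("/path/target");
        fs.addLink("/path/symlink", "/path/target");
        return fs;
    }
}
```

With {{ToyGlob.sample()}}, {{glob("/path/symlink", false)}} reports the symlink while {{glob("/path/sym*link", false)}} reports a directory, even though both patterns match the same single path. Passing {{true}} makes both report a directory.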
{{globStatus}} _must_ consistently return resolved paths. The semantics
altered by HADOOP-9877 will break lots of code. I'm pretty sure that includes
{{FsShell}}. We cannot break long-standing semantics just for snapshots.
Why does .snapshot support require a {{getFileLinkStatus}}? Does
{{getFileStatus}} not work for a .snapshot directory?
> globStatus of a symlink to a directory does not report symlink as a directory
> -----------------------------------------------------------------------------
>
> Key: HADOOP-9912
> URL: https://issues.apache.org/jira/browse/HADOOP-9912
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs
> Affects Versions: 2.3.0
> Reporter: Jason Lowe
> Priority: Blocker
> Attachments: HADOOP-9912-testcase.patch
>
>
> globStatus for a path that is a symlink to a directory used to report the
> resulting FileStatus as a directory but recently this has changed.