[
https://issues.apache.org/jira/browse/HADOOP-9984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14519971#comment-14519971
]
Sanjay Radia commented on HADOOP-9984:
--------------------------------------
bq. The problem with dereferencing all symlinks in listStatus is that it's
disastrously inefficient
# In the proposal, listStatus2 is the new API that replaces listStatus.
# All our libraries need to be changed to use listStatus2 (see item 3 in the
proposal).
# Customers who have old code that calls the old listStatus and cannot convert
that code immediately can disable symlinks, not use symlinks, or use symlinks
sparingly. In practice I don't think there will be dirs with even tens of
symlinks (but symlink2 addresses the problem going forward).
bq. isSymlink is broken for dangling symlinks, FileSystem#rename is broken for
symlinks, the behavior of symlinks in globStatus is controversial, distCp
doesn't support it, ...
These are fixable. I think this jira itself was attempting to fix some of these
when we ran into the design flaw of the original listStatus.
bq. cross-filesystem symlinks ...
As I pointed out, this needs to be discussed. Let me make a separate comment
that summarizes the cross-namespace issues that have been presented in the
various comments in this and other jiras.
> FileSystem#globStatus and FileSystem#listStatus should resolve symlinks by
> default
> ----------------------------------------------------------------------------------
>
> Key: HADOOP-9984
> URL: https://issues.apache.org/jira/browse/HADOOP-9984
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs
> Affects Versions: 2.1.0-beta
> Reporter: Colin Patrick McCabe
> Assignee: Colin Patrick McCabe
> Priority: Critical
> Attachments: HADOOP-9984.001.patch, HADOOP-9984.003.patch,
> HADOOP-9984.005.patch, HADOOP-9984.007.patch, HADOOP-9984.009.patch,
> HADOOP-9984.010.patch, HADOOP-9984.011.patch, HADOOP-9984.012.patch,
> HADOOP-9984.013.patch, HADOOP-9984.014.patch, HADOOP-9984.015.patch
>
>
> During the process of adding symlink support to FileSystem, we realized that
> many existing HDFS clients would be broken by listStatus and globStatus
> returning symlinks. One example is applications that assume that
> !FileStatus#isFile implies that the inode is a directory. As we discussed in
> HADOOP-9972 and HADOOP-9912, we should default these APIs to returning
> resolved paths.