[ https://issues.apache.org/jira/browse/HADOOP-9972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770471#comment-13770471 ]
Binglin Chang commented on HADOOP-9972:
---------------------------------------

Hi Colin,

About the globStatus example: if we follow Linux practice,

globStatus(pattern) = glob(pattern).map(path => getFileStatus(path))

String[] glob(pattern):
  if nothing matches, return the pattern itself
  else return the matched paths
  ignore all exceptions

I did some experiments. You can see that ls * does indeed show an error message, but ls */stuff (here, ls */cc) does not.

{code}
[root@master01 test]# mkdir -p aa/cc/foo
[root@master01 test]# mkdir -p bb/cc/foo
[root@master01 test]# chmod 700 bb
[root@master01 test]# ll /home/serengeti/.bash
[root@master01 test]# su serengeti
[serengeti@master01 test]$ ll
total 8
drwxr-xr-x 3 root root 4096 Sep 18 08:30 aa
drwx------ 3 root root 4096 Sep 18 08:31 bb
[serengeti@master01 test]$ ls *
aa:
cc
ls: bb: Permission denied
[serengeti@master01 test]$ ls */cc
foo
{code}

Separating globStatus into glob and getFileStatus seems a more proper way of implementing globStatus than adding new classes/interfaces and a callback handler. Since this is Linux practice, it should also be more robust.

> new APIs for listStatus and globStatus to deal with symlinks
> ------------------------------------------------------------
>
>                 Key: HADOOP-9972
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9972
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs
>    Affects Versions: 2.1.1-beta
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>
> Based on the discussion in HADOOP-9912, we need new APIs for FileSystem to
> deal with symlinks. The issue is that code has been written which is
> incompatible with the existence of things which are not files or directories.
> For example, there is a lot of code out there that looks at
> FileStatus#isFile, and if it returns false, assumes that what it is looking
> at is a directory. In the case of a symlink, this assumption is incorrect.
> It seems reasonable to make the default behavior of {{FileSystem#listStatus}}
> and {{FileSystem#globStatus}} be fully resolving symlinks, and ignoring
> dangling ones. This will prevent incompatibility with existing MR jobs and
> other HDFS users. We should also add new versions of listStatus and
> globStatus that allow new, symlink-aware code to deal with symlinks as
> symlinks.
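
For illustration, the glob-then-stat split proposed in the comment could be sketched roughly as below. This is only a minimal model under stated assumptions, not Hadoop code: it does not use the FileSystem API, and the class GlobThenStat, the in-memory READABLE map, and the helpers reachable, matches, glob and list are all hypothetical names. The matcher only understands * as a whole path component. The point it tries to show is that errors during expansion are swallowed, while errors in the per-path step are reported.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GlobThenStat {
    // Hypothetical in-memory tree mirroring the experiment: path -> readable by the current user.
    static final Map<String, Boolean> READABLE = new LinkedHashMap<>();
    static {
        READABLE.put("aa", true);
        READABLE.put("aa/cc", true);
        READABLE.put("bb", false);   // chmod 700, owned by root
        READABLE.put("bb/cc", true); // exists, but hidden behind the unreadable bb
    }

    // A path can be discovered only if every ancestor directory is readable.
    static boolean reachable(String path) {
        for (int i = path.indexOf('/'); i >= 0; i = path.indexOf('/', i + 1)) {
            if (!READABLE.getOrDefault(path.substring(0, i), false)) return false;
        }
        return true;
    }

    // Simplified matcher (assumption): "*" matches exactly one whole path component.
    static boolean matches(String pattern, String path) {
        String[] pp = pattern.split("/");
        String[] cc = path.split("/");
        if (pp.length != cc.length) return false;
        for (int i = 0; i < pp.length; i++) {
            if (!pp[i].equals("*") && !pp[i].equals(cc[i])) return false;
        }
        return true;
    }

    // Expansion step: unreachable paths are skipped silently, never reported.
    static List<String> glob(String pattern) {
        List<String> out = new ArrayList<>();
        for (String path : READABLE.keySet()) {
            if (reachable(path) && matches(pattern, path)) out.add(path);
        }
        if (out.isEmpty()) out.add(pattern); // shell convention: pass the pattern through
        return out;
    }

    // Stand-in for the per-path step (ls on a directory, or getFileStatus in the proposal).
    static String list(String path) {
        if (!READABLE.containsKey(path)) {
            throw new IllegalStateException(path + ": No such file or directory");
        }
        if (!reachable(path) || !READABLE.get(path)) {
            throw new IllegalStateException(path + ": Permission denied");
        }
        return path;
    }

    public static void main(String[] args) {
        for (String pattern : new String[] {"*", "*/cc"}) {
            for (String path : glob(pattern)) {
                try {
                    System.out.println(list(path));
                } catch (IllegalStateException e) {
                    System.out.println("error: " + e.getMessage());
                }
            }
        }
    }
}
```

With this model, the pattern * expands to aa and bb, and the error for bb comes from the per-path step. The pattern */cc expands only to aa/cc, because bb/cc is never discovered, so no error is shown.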