[ 
https://issues.apache.org/jira/browse/HADOOP-9972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770471#comment-13770471
 ] 

Binglin Chang commented on HADOOP-9972:
---------------------------------------

Hi Colin, 
Regarding the globStatus example: if we follow Linux shell practice, then
globStatus(pattern) = glob(pattern).map(path => getFileStatus(path)), where:
String[] glob(pattern):
  if nothing matches, return the pattern itself
  else return the matched paths
  ignore all exceptions
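
A minimal sketch of this two-phase idea, written in Python rather than against Hadoop's Java FileSystem API so that it stays short and self-contained. The names shell_glob and glob_status are hypothetical, and the sketch relies on CPython's glob.glob skipping directories it cannot read, which stands in for "ignore all exceptions" during expansion. Errors then surface only in the second phase, when each expanded path is stat'ed.

```python
import glob
import os


def shell_glob(pattern):
    """Phase 1: expand a pattern the way a POSIX shell does by default."""
    # glob.glob skips directories it cannot list instead of raising,
    # so expansion itself never fails (assumption: CPython behaviour).
    matches = sorted(glob.glob(pattern))
    # An unmatched pattern is passed through literally, not dropped.
    return matches if matches else [pattern]


def glob_status(pattern):
    """Phase 2: stat every expanded path; any error is raised here."""
    return [(path, os.stat(path)) for path in shell_glob(pattern)]
```

With this split, an unmatched pattern reaches the stat step as a literal path and fails there, much like ls reporting "No such file or directory" for an unexpanded pattern.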

I ran some experiments. As you can see, ls * does indeed show an error message, 
but ls */stuff (ls */cc in the session below) does not.
{code}
[root@master01 test]# mkdir -p aa/cc/foo
[root@master01 test]# mkdir -p bb/cc/foo
[root@master01 test]# chmod 700 bb
[root@master01 test]# ll /home/serengeti/.bash
[root@master01 test]# su serengeti
[serengeti@master01 test]$ ll
total 8
drwxr-xr-x 3 root root 4096 Sep 18 08:30 aa
drwx------ 3 root root 4096 Sep 18 08:31 bb
[serengeti@master01 test]$ ls *
aa:
cc
ls: bb: Permission denied
[serengeti@master01 test]$ ls */cc
foo
{code}

Separating globStatus into glob and getFileStatus seems a more appropriate way 
to implement globStatus than adding new classes/interfaces and a callback 
handler. It also follows Linux practice, so it should be more robust.

> new APIs for listStatus and globStatus to deal with symlinks
> ------------------------------------------------------------
>
>                 Key: HADOOP-9972
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9972
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs
>    Affects Versions: 2.1.1-beta
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>
> Based on the discussion in HADOOP-9912, we need new APIs for FileSystem to 
> deal with symlinks.  The issue is that code has been written which is 
> incompatible with the existence of things which are not files or directories. 
> For example, there is a lot of code out there that looks at 
> FileStatus#isFile, and if it returns false, assumes that what it is looking 
> at is a directory.  In the case of a symlink, this assumption is incorrect.
> It seems reasonable to make the default behavior of {{FileSystem#listStatus}} 
> and {{FileSystem#globStatus}} be fully resolving symlinks, and ignoring 
> dangling ones.  This will prevent incompatibility with existing MR jobs and 
> other HDFS users.  We should also add new versions of listStatus and 
> globStatus that allow new, symlink-aware code to deal with symlinks as 
> symlinks.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
