[ 
https://issues.apache.org/jira/browse/HADOOP-9972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13777020#comment-13777020
 ] 

Colin Patrick McCabe commented on HADOOP-9972:
----------------------------------------------

I thought about this for a while, and I think adding a {{listLinkStatus}} 
function, as you suggest, is probably a good idea.  It would be analogous to 
{{getFileLinkStatus}}.  The nice thing about this approach is that filesystems 
which don't yet support symlinks can fall back to the default behavior of 
resolving all links (the same thing {{getFileStatus}} does).
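
As a rough illustration of that fallback idea, here is a minimal, self-contained sketch.  It uses stand-in types rather than the real Hadoop classes: {{SketchFs}}, {{Entry}}, {{Kind}}, {{PlainFs}} and {{LinkFs}} are all hypothetical names, and the signature of {{listLinkStatus}} is an assumption, not a proposed patch.

```java
import java.util.Arrays;
import java.util.List;

public class ListLinkStatusSketch {
    // Hypothetical stand-in for FileStatus; not the real Hadoop class.
    enum Kind { FILE, DIRECTORY, SYMLINK }

    static final class Entry {
        final String path;
        final Kind kind;
        Entry(String path, Kind kind) { this.path = path; this.kind = kind; }
    }

    // Hypothetical stand-in for FileSystem.
    static abstract class SketchFs {
        // Existing behavior: every returned entry is a file or a directory.
        abstract List<Entry> listStatus(String dir);

        // Assumed new method. By default it falls back to the resolving
        // listing, so filesystems without symlink support need no changes.
        List<Entry> listLinkStatus(String dir) {
            return listStatus(dir);
        }
    }

    // A filesystem with no symlink support simply inherits the default.
    static final class PlainFs extends SketchFs {
        @Override
        List<Entry> listStatus(String dir) {
            return Arrays.asList(new Entry(dir + "/a", Kind.FILE),
                                 new Entry(dir + "/d", Kind.DIRECTORY));
        }
    }

    // A symlink-aware filesystem overrides listLinkStatus to return links as links.
    static final class LinkFs extends SketchFs {
        @Override
        List<Entry> listStatus(String dir) {
            // The link "l" is resolved, so it is reported as a plain file here.
            return Arrays.asList(new Entry(dir + "/a", Kind.FILE),
                                 new Entry(dir + "/l", Kind.FILE));
        }

        @Override
        List<Entry> listLinkStatus(String dir) {
            return Arrays.asList(new Entry(dir + "/a", Kind.FILE),
                                 new Entry(dir + "/l", Kind.SYMLINK));
        }
    }

    public static void main(String[] args) {
        for (Entry e : new LinkFs().listLinkStatus("/tmp")) {
            System.out.println(e.path + " " + e.kind);
        }
    }
}
```

With this shape, existing callers of {{listStatus}} never see a {{SYMLINK}} entry, and only code that opts in to {{listLinkStatus}} has to handle one.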

bq. Are you saying network issue can cause unresolved symlink error?

Yes.  Symlinks can be cross-filesystem, and if one filesystem is unreachable, 
that would be a network error.

For globStatus, I'd like to do something similar to {{FileContext#create}}, 
where you have a varargs argument with {{CreateOpts}}.
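
To make the varargs idea concrete, here is a small self-contained sketch.  Nothing in it is real Hadoop API: {{GlobOpt}}, {{ResolveLinks}}, {{Entry}} and this in-memory {{globStatus}} are hypothetical stand-ins, and the glob matching only understands {{*}}.  It assumes the default is to resolve links and drop dangling ones, as the issue description proposes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class GlobOptsSketch {
    // Hypothetical option type, loosely modeled on the CreateOpts varargs style.
    static abstract class GlobOpt {
        static GlobOpt resolveLinks(boolean resolve) { return new ResolveLinks(resolve); }
    }

    static final class ResolveLinks extends GlobOpt {
        final boolean resolve;
        ResolveLinks(boolean resolve) { this.resolve = resolve; }
    }

    // Hypothetical stand-in for FileStatus.
    static final class Entry {
        final String path;
        final boolean symlink;
        final boolean dangling; // only meaningful when symlink is true
        Entry(String path, boolean symlink, boolean dangling) {
            this.path = path; this.symlink = symlink; this.dangling = dangling;
        }
    }

    // Translate a glob that only uses '*' into a regex; '*' does not cross '/'.
    static Pattern toRegex(String glob) {
        StringBuilder sb = new StringBuilder();
        for (char c : glob.toCharArray()) {
            sb.append(c == '*' ? "[^/]*" : Pattern.quote(String.valueOf(c)));
        }
        return Pattern.compile(sb.toString());
    }

    // With no options, links are resolved and dangling links are ignored.
    static List<Entry> globStatus(List<Entry> all, String glob, GlobOpt... opts) {
        boolean resolve = true;
        for (GlobOpt o : opts) {
            if (o instanceof ResolveLinks) resolve = ((ResolveLinks) o).resolve;
        }
        Pattern p = toRegex(glob);
        List<Entry> out = new ArrayList<>();
        for (Entry e : all) {
            if (!p.matcher(e.path).matches()) continue;
            if (!e.symlink || !resolve) { out.add(e); continue; }
            if (e.dangling) continue;                  // resolving mode skips dangling links
            out.add(new Entry(e.path, false, false));  // report the target as a non-link
        }
        return out;
    }

    public static void main(String[] args) {
        List<Entry> all = new ArrayList<>();
        all.add(new Entry("/d/a", false, false));
        all.add(new Entry("/d/l", true, false));
        all.add(new Entry("/d/x", true, true));
        System.out.println(globStatus(all, "/d/*").size());
        System.out.println(globStatus(all, "/d/*", GlobOpt.resolveLinks(false)).size());
    }
}
```

Existing callers that pass no options keep the resolving behavior, while symlink-aware callers opt in explicitly.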

One nice thing is that, unlike {{listStatus}}, {{FileSystem#globStatus}} is 
not implemented by subclasses.
                
> new APIs for listStatus and globStatus to deal with symlinks
> ------------------------------------------------------------
>
>                 Key: HADOOP-9972
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9972
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs
>    Affects Versions: 2.1.1-beta
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>
> Based on the discussion in HADOOP-9912, we need new APIs for FileSystem to 
> deal with symlinks.  The issue is that code has been written which is 
> incompatible with the existence of things which are not files or directories. 
> For example, there is a lot of code out there that looks at 
> FileStatus#isFile, and if it returns false, assumes that what it is looking 
> at is a directory.  In the case of a symlink, this assumption is incorrect.
> It seems reasonable to make the default behavior of {{FileSystem#listStatus}} 
> and {{FileSystem#globStatus}} be fully resolving symlinks, and ignoring 
> dangling ones.  This will prevent incompatibility with existing MR jobs and 
> other HDFS users.  We should also add new versions of listStatus and 
> globStatus that allow new, symlink-aware code to deal with symlinks as 
> symlinks.
