[ https://issues.apache.org/jira/browse/HADOOP-9972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13773471#comment-13773471 ]
Colin Patrick McCabe commented on HADOOP-9972:
----------------------------------------------

bq. Just to be clear, what happens if the error handler does not rethrow the exception?

If the error handler doesn't rethrow the exception, the listStatus / globStatus operation continues normally and returns the remaining results. (We can't return the result that had the error.) Unresolved symlinks are one type of error. Whether to handle {{UnresolvedLinkException}} differently from other exceptions is up to the {{PathErrorHandler}} you provide.

bq. I'm not sure if the difference between "log exception and continue" vs. "ignore it completely" is a different return code from the error handler method or just whether the handler logs or not.

I was proposing that the logging happen inside the {{PathErrorHandler}}. From the point of view of FileSystem / FileContext, all we care about is whether the {{PathErrorHandler}} rethrows the exception or not. (We can provide a class implementing {{PathErrorHandler}} that logs to FileSystem#LOG if that is a common use case.)

bq. I suppose one could derive a new interface from PathFilter that becomes PathOptions and listStatus(Path, PathFilter) could check internally if it's actually got a PathOption instead of a PathFilter and behave differently. However I think an explicit, separate API would be preferable though, simply for clarity of what the API expects from callers.

Yeah, I was proposing adding a new type, {{PathOptions}}, which could contain an instance of {{PathFilter}}. We could add new methods to {{PathFilter}}, but since it's a public/stable interface rather than an abstract class, that would be an incompatible change.
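To make the proposed semantics concrete, here is a minimal, self-contained sketch. Only the names {{PathErrorHandler}} and {{PathOptions}} come from the discussion above. The method signatures, the {{Resolver}} helper, the {{ListingSketch}} class, and the use of plain strings and {{Predicate}} in place of Hadoop's {{Path}} and {{PathFilter}} are all assumptions for illustration, not a committed API.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical callback; the name comes from the comment, the signature is assumed. */
interface PathErrorHandler {
    /** Rethrow to abort the listing; return normally to skip the failed entry. */
    void handleError(String path, IOException e) throws IOException;
}

/** Hypothetical options holder that contains a filter instead of extending PathFilter. */
final class PathOptions {
    final Predicate<String> filter;        // stand-in for Hadoop's PathFilter
    final PathErrorHandler errorHandler;

    PathOptions(Predicate<String> filter, PathErrorHandler errorHandler) {
        this.filter = filter;
        this.errorHandler = errorHandler;
    }
}

final class ListingSketch {
    /** Stand-in for whatever resolves one entry (for example, following a symlink). */
    interface Resolver {
        String resolve(String path) throws IOException;
    }

    static List<String> list(List<String> entries, Resolver resolver, PathOptions opts)
            throws IOException {
        List<String> results = new ArrayList<>();
        for (String entry : entries) {
            if (!opts.filter.test(entry)) {
                continue;
            }
            try {
                results.add(resolver.resolve(entry));
            } catch (IOException e) {
                // If the handler returns, the failed entry is dropped and the loop continues.
                // If it rethrows, the whole listing fails with that exception.
                opts.errorHandler.handleError(entry, e);
            }
        }
        return results;
    }

    public static void main(String[] args) throws IOException {
        Resolver resolver = p -> {
            if (p.endsWith("dangling")) {
                throw new IOException("unresolved link: " + p);
            }
            return p;
        };
        // "Log and continue" lives entirely inside the handler.
        PathErrorHandler logAndContinue =
            (p, e) -> System.err.println("skipping " + p + ": " + e.getMessage());
        PathOptions opts = new PathOptions(p -> true, logAndContinue);
        System.out.println(list(List.of("/a", "/dangling", "/b"), resolver, opts));
    }
}
```

In this sketch the listing code never needs to know whether the handler logs, since the only signal it observes is whether the handler rethrows. Keeping the filter as a field of {{PathOptions}} also leaves the existing {{PathFilter}} interface untouched.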
> new APIs for listStatus and globStatus to deal with symlinks
> ------------------------------------------------------------
>
>                 Key: HADOOP-9972
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9972
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs
>    Affects Versions: 2.1.1-beta
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>
> Based on the discussion in HADOOP-9912, we need new APIs for FileSystem to
> deal with symlinks. The issue is that code has been written which is
> incompatible with the existence of things which are not files or directories.
> For example,
> there is a lot of code out there that looks at FileStatus#isFile, and
> if it returns false, assumes that what it is looking at is a
> directory. In the case of a symlink, this assumption is incorrect.
> It seems reasonable to make the default behavior of {{FileSystem#listStatus}}
> and {{FileSystem#globStatus}} be fully resolving symlinks, and ignoring
> dangling ones. This will prevent incompatibility with existing MR jobs and
> other HDFS users. We should also add new versions of listStatus and
> globStatus that allow new, symlink-aware code to deal with symlinks as
> symlinks.