[
https://issues.apache.org/jira/browse/HADOOP-9984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13782684#comment-13782684
]
Colin Patrick McCabe commented on HADOOP-9984:
----------------------------------------------
* skip Windows in testCreateDanglingLink
* fix TOCTOU in RawLocalFileSystem where we could sometimes return an invalid
result if a directory was removed at the wrong time.
* fix Stat on BSD (thanks, Chris)
* DCRException#serialVersionUID should be private.
* capitalize error when throwing DCRException
* listStatusImpl -> listStatusInternal
* add DCRException to some throws clauses that already declare IOE (does
nothing, but it serves as extra documentation)
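For readers unfamiliar with the TOCTOU item in the list above, here is a minimal sketch of the general pattern. It is not the code in the patch: the class and method names are hypothetical, and it uses plain {{java.io.File}} rather than RawLocalFileSystem. The racy shape checks the directory and then lists it, so a removal between the two calls can leak a null result. The safer shape makes one call and interprets its result.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;

// Hypothetical illustration only; not taken from the HADOOP-9984 patch.
class LocalListing {
    // Racy shape: the directory can be removed after isDirectory()
    // returns true and before list() runs, so list() may return null
    // and the caller gets an invalid result.
    static String[] listRacy(File dir) throws IOException {
        if (!dir.isDirectory()) {
            throw new FileNotFoundException(
                "File " + dir + " does not exist or is not a directory");
        }
        return dir.list();
    }

    // Safer shape: call list() once and treat null as "gone or not a
    // directory", so there is no window between a check and the use.
    static String[] listChecked(File dir) throws IOException {
        String[] names = dir.list();
        if (names == null) {
            throw new FileNotFoundException(
                "File " + dir + " does not exist or is not a directory");
        }
        return names;
    }
}
```

With the second shape, a directory that disappears mid-call surfaces as an exception instead of a null array.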
bq. Rather than copy+pasting javadoc for similar methods, I like using See
@link with a small note about the differences (if any). Will help shrink this
patch.
OK. The javadoc for each {{listStatus}} implementation now refers to the
corresponding {{listLinkStatus}} one, with a note about the added exception and
behavior.
bq. In the createSymlink javadoc, listLinkStatus should also be in the list of
functions that fully resolve
Added.
bq. Are we planning to add globLinkStatus type methods to
FileSystem/FileContext? Right now we have this resolveLinks which is always
true (and in TestGlobStatus too). It's a little confusing right now; I'd like
to either see the new APIs included here, or all of it broken out into a
separate JIRA
It's under discussion in HADOOP-9972. I think it will end up being a lot like
CreateOptions, but let's hold off on discussing that for now, since this JIRA
is already big enough.
bq. Rather than uriToSchemeAndAuthority, can we instead use Path#makeQualified
or FileSystem#makeQualified? If not, I also preferred the old style, since
using parameters as return values kinda bites.
We can't qualify a path pattern because it may involve things like
"{a,/b}/foo", where the different branches of the pattern have to be qualified
in different ways. The old style of two separate functions didn't work because
the decision about whether to use the default for scheme affects the decision
to use the default for authority. It could be inlined into the main body of
the glob function, but I'd prefer not to make it bigger.
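To make the coupling between the two defaults concrete, here is a rough sketch. It is not the helper in the patch: the class name, the {{resolve}} method, and the exact defaulting rule are assumptions for illustration, and it uses {{java.net.URI}} instead of Path. The assumed rule is that the default authority is only applied when the scheme is missing or matches the default scheme.

```java
import java.net.URI;

// Hypothetical illustration only; the rule below is an assumption,
// not the behavior of the actual uriToSchemeAndAuthority helper.
class SchemeAndAuthority {
    final String scheme;
    final String authority;

    SchemeAndAuthority(String scheme, String authority) {
        this.scheme = scheme;
        this.authority = authority;
    }

    // Decide scheme and authority together, because whether the
    // scheme was defaulted determines whether the default authority
    // is meaningful for this path.
    static SchemeAndAuthority resolve(URI path, URI defaultFs) {
        String scheme = path.getScheme();
        String authority = path.getAuthority();
        boolean sameFs = (scheme == null)
            || scheme.equals(defaultFs.getScheme());
        if (scheme == null) {
            scheme = defaultFs.getScheme();
        }
        // Only borrow the default authority when the path is on the
        // default filesystem; "nn1:8020" means nothing for file://.
        if (authority == null && sameFs) {
            authority = defaultFs.getAuthority();
        }
        return new SchemeAndAuthority(scheme, authority);
    }
}
```

Under this rule, with a default of hdfs://nn1:8020/, the path /foo picks up both defaults, while file:///foo keeps its own scheme and gets no authority. Two independent functions could not express that dependency without repeating the scheme check.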
bq. In WebHdfs and HttpFS, the Op / Operation is still called LISTSTATUS.
We can't change the name of that because it would break wire compatibility in
the HTTP request. Those enums get stringified. I don't think this will lead
to any confusion since the link-resolving version will be implemented on the
client side (i.e., it will not require another type of LIST RPC).
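A small sketch of why the enum name is part of the wire format. This is not the WebHDFS code: the {{WireOp}} class, its two-constant enum, and the helper methods are hypothetical. It only assumes what is stated above, namely that the constant's name is stringified into the request.

```java
// Hypothetical illustration only; not the WebHDFS or HttpFS classes.
class WireOp {
    // A made-up subset of operations for the example.
    enum Op { LISTSTATUS, GETFILESTATUS }

    // Client side: the constant's name goes straight into the query.
    static String toQuery(Op op) {
        return "op=" + op.name();
    }

    // Server side: the name is parsed back into the enum. An unknown
    // name makes Enum.valueOf throw IllegalArgumentException.
    static Op fromQuery(String query) {
        return Op.valueOf(query.substring("op=".length()));
    }
}
```

If the constant were renamed, an older peer sending or expecting op=LISTSTATUS would no longer interoperate with a newer one, which is the compatibility break described above.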
bq. Please use GenericTestUtils#assertExceptionContains in the new symlink
test, you can check for the right path in the exception message.
OK.
bq. [path comments]
Well, as you mentioned, the Path issues are clearly out of scope for this JIRA.
This is a tangent, but I am not convinced by the proposal that we return the
built-up path everywhere. It would lead to a lot of unnecessary symlink
resolutions, since we'd have to redo all the work of resolution whenever we
used the paths. Plus, in the case of cross-filesystem links, it doesn't even
make sense. What can you add to the end of an hdfs:// path that makes it a
file:// path? Nothing. Finally, from an implementation perspective, this
requires revisiting pretty much every FC or FS operation, since they all return
resolved paths now.
The built-up path is information that programs can deduce themselves, in the
same way globStatus does when resolveLinks = false. (The resolveLinks = false
case is not exposed by an API yet, but it will be in HADOOP-9972.)
> FileSystem#globStatus and FileSystem#listStatus should resolve symlinks by
> default
> ----------------------------------------------------------------------------------
>
> Key: HADOOP-9984
> URL: https://issues.apache.org/jira/browse/HADOOP-9984
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs
> Affects Versions: 2.1.0-beta
> Reporter: Colin Patrick McCabe
> Assignee: Colin Patrick McCabe
> Priority: Blocker
> Attachments: HADOOP-9984.001.patch, HADOOP-9984.003.patch,
> HADOOP-9984.005.patch, HADOOP-9984.007.patch, HADOOP-9984.009.patch,
> HADOOP-9984.010.patch, HADOOP-9984.011.patch
>
>
> During the process of adding symlink support to FileSystem, we realized that
> many existing HDFS clients would be broken by listStatus and globStatus
> returning symlinks. One example is applications that assume that
> !FileStatus#isFile implies that the inode is a directory. As we discussed in
> HADOOP-9972 and HADOOP-9912, we should default these APIs to returning
> resolved paths.