[ 
https://issues.apache.org/jira/browse/HADOOP-9984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13782684#comment-13782684
 ] 

Colin Patrick McCabe commented on HADOOP-9984:
----------------------------------------------

* skip Windows in testCreateDanglingLink

* fix TOCTOU in RawLocalFileSystem where we could sometimes return an invalid 
result if a directory was removed at the wrong time.

* fix Stat on BSD (thanks, Chris)

* DCRException#serialVersionUID should be private.

* capitalize error when throwing DCRException

* listStatusImpl -> listStatusInternal

* add DCRException to some throw specs that already throw IOE (does nothing, 
but it serves as extra documentation)
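
To illustrate the TOCTOU item above: this is only a minimal sketch of the general "act first, then interpret the failure" pattern, not the code in the patch. The class and method names ({{ListingSketch}}, {{listNames}}) are made up for the example; only {{java.io.File#list}} and its documented null return are real.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;

// Hypothetical helper, not part of RawLocalFileSystem.
final class ListingSketch {
  /**
   * Lists the child names of dir without a separate exists()/isDirectory()
   * check beforehand, so a concurrent removal cannot slip in between the
   * check and the listing.
   */
  static String[] listNames(File dir) throws IOException {
    // File#list returns null when the path is not a directory or an I/O
    // error occurs, so the null case has to be handled explicitly.
    String[] names = dir.list();
    if (names == null) {
      if (!dir.exists()) {
        // The directory vanished (or never existed): report it as missing
        // instead of handing a null result back to the caller.
        throw new FileNotFoundException("File " + dir + " does not exist");
      }
      throw new IOException("Could not list " + dir);
    }
    return names;
  }
}
```

The point is that the caller always sees either a valid array or an exception, never a result computed from a stale existence check.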

bq. Rather than copy+pasting javadoc for similar methods, I like using See 
@link with a small note about the differences (if any). Will help shrink this 
patch.

OK.  I am referencing all the {{listStatus}} implementations to the 
{{listLinkStatus}} ones, with a note about the added exception and behavior.

bq. In the createSymlink javadoc, listLinkStatus should also be in the list of 
functions that fully resolve

Added.

bq. Are we planning to add globLinkStatus type methods to 
FileSystem/FileContext? Right now we have this resolveLinks which is always 
true (and in TestGlobStatus too). It's a little confusing right now; I'd like 
to either see the new APIs included here, or all of it broken out into a 
separate JIRA

It's under discussion in HADOOP-9972.  I think it will end up being a lot like 
CreateOptions, but let's hold off on discussing that for now, since this JIRA 
is already big enough.

bq. Rather than uriToSchemeAndAuthority, can we instead use Path#makeQualified 
or FileSystem#makeQualified? If not, I also preferred the old style, since 
using parameters as return values kinda bites.

We can't qualify a path pattern because it may involve things like 
"\{a,/b\}/foo" where the different branches of the pattern have to be qualified 
in different ways.  The old style of two separate functions didn't work because 
the decision about whether to use the default for scheme affects the decision 
to use the default for authority.  It could be inlined into the main body of 
the glob function, but I'd prefer not to make it bigger.
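
As a rough illustration of why the pattern as a whole can't be qualified up front, here is a simplified sketch. It is not the Globber code; {{GlobBranchSketch}}, {{expandBraces}} and {{qualify}} are hypothetical names, the expansion ignores nesting and escapes, and the scheme/authority and working directory are example values.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; the real glob code is more involved.
final class GlobBranchSketch {
  /** Expands {x,y,...} groups left to right. Nesting and escapes are not handled. */
  static List<String> expandBraces(String pattern) {
    List<String> out = new ArrayList<>();
    int open = pattern.indexOf('{');
    int close = pattern.indexOf('}', open + 1);
    if (open < 0 || close < 0) {
      out.add(pattern);
      return out;
    }
    String prefix = pattern.substring(0, open);
    String suffix = pattern.substring(close + 1);
    for (String alt : pattern.substring(open + 1, close).split(",")) {
      // Recurse so that any later brace group is expanded as well.
      out.addAll(expandBraces(prefix + alt + suffix));
    }
    return out;
  }

  /** Qualifies one branch: relative branches are resolved against workingDir. */
  static String qualify(String branch, String schemeAndAuthority, String workingDir) {
    String absolute = branch.startsWith("/") ? branch : workingDir + "/" + branch;
    return schemeAndAuthority + absolute;
  }
}
```

With the pattern from above, one branch comes out relative ("a/foo") and the other absolute ("/b/foo"), so against an assumed working directory of /user/alice on hdfs://nn:8020 they would qualify to hdfs://nn:8020/user/alice/a/foo and hdfs://nn:8020/b/foo respectively.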

bq. In WebHdfs and HttpFS, the Op / Operation is still called LISTSTATUS.

We can't change the name of that because it would break wire compatibility in 
the HTTP request.  Those enums get stringified.  I don't think this will lead 
to any confusion since the link-resolving version will be implemented on the 
client side (i.e., it will not require another type of LIST RPC).
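
A small sketch of the compatibility concern, assuming the usual pattern of putting the enum constant's name into the query string. {{WireNameSketch}}, {{query}} and {{parse}} are illustrative names, not the WebHDFS classes, and LISTLINKSTATUS is a made-up constant name used only to show the failure.

```java
// Hypothetical stand-in for an Op enum whose constant names go on the wire.
final class WireNameSketch {
  enum Op { LISTSTATUS, GETFILESTATUS }

  /** Client side: the constant's name becomes part of the HTTP query string. */
  static String query(Op op) {
    return "op=" + op; // Enum#toString returns the constant name by default
  }

  /** Server side: the name is parsed back into the enum constant. */
  static Op parse(String query) {
    // Enum.valueOf throws IllegalArgumentException for an unknown name,
    // which is what a peer would hit if one side renamed the constant.
    return Op.valueOf(query.substring("op=".length()));
  }
}
```

Here {{query(Op.LISTSTATUS)}} yields "op=LISTSTATUS", while parsing a renamed value such as "op=LISTLINKSTATUS" fails, which is why the constant name has to stay as it is.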

bq. Please use GenericTestUtils#assertExceptionContains in the new symlink 
test, you can check for the right path in the exception message.

OK.

bq. [path comments]

Well, as you mentioned, the Path issues are clearly out of scope for this JIRA.

This is a tangent, but I am not convinced by the proposal that we return the 
built-up path everywhere.  It would lead to a lot of unnecessary symlink 
resolutions, since we'd have to redo all the work of resolution whenever we 
used the paths.  Plus, in the case of cross-filesystem links, it doesn't 
even make sense.  What can you add to the end of an hdfs:// path that makes it 
a file:// path?  Nothing.  Finally, from an implementation perspective, this 
requires revisiting pretty much every FC or FS operation, since they all return 
resolved paths now.

The built-up path is information that programs can deduce themselves, in the 
same way globStatus does when resolveLinks = false.  (The resolveLinks = false 
case is not exposed by an API yet, but it will be in HADOOP-9972.)

> FileSystem#globStatus and FileSystem#listStatus should resolve symlinks by 
> default
> ----------------------------------------------------------------------------------
>
>                 Key: HADOOP-9984
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9984
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 2.1.0-beta
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>            Priority: Blocker
>         Attachments: HADOOP-9984.001.patch, HADOOP-9984.003.patch, 
> HADOOP-9984.005.patch, HADOOP-9984.007.patch, HADOOP-9984.009.patch, 
> HADOOP-9984.010.patch, HADOOP-9984.011.patch
>
>
> During the process of adding symlink support to FileSystem, we realized that 
> many existing HDFS clients would be broken by listStatus and globStatus 
> returning symlinks.  One example is applications that assume that 
> !FileStatus#isFile implies that the inode is a directory.  As we discussed in 
> HADOOP-9972 and HADOOP-9912, we should default these APIs to returning 
> resolved paths.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
