[ https://issues.apache.org/jira/browse/HADOOP-4044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633842#action_12633842 ]

Doug Cutting commented on HADOOP-4044:
--------------------------------------

Some comments:
 - I don't like the 'vfs' package and naming.  Symbolic links should not be a 
distinguished part of the FileSystem API, but seamlessly integrated into it.  
So I suggest that vfs/VfsStatus be renamed to FSLinkable, vfs/VfsStatusBase to 
FSLink, vfs/VfsStatusBoolean to FSLinkBoolean, vfs/VfsStatusFileStatus to 
LinkableFileStatus, etc.  If these are to be in a separate package, it might be 
called 'fs/spi', since they are needed only by implementors of the FileSystem 
API, not by users.  The protected implementation methods should be called 
openImpl(), appendImpl(), etc.
 - getLink() should return a Path, not a String.
 - getLink() should throw an exception when isLink() is false.
 - The check for link cycles is wrong.  If the loop starts after the first link 
traversed, it will not be detected.  A common approach is simply to limit the 
number of links traversed to a constant.  Alternatively, keep a 'fast' and a 
'slow' pointer, advancing the fast pointer through the list twice as fast as 
the slow one; if they are ever equal, there is a loop.  This detects all loops 
(a sketch of the fast/slow variant follows this list).
 - I don't see the need for both getLink() and getRemainingPath().  Wouldn't it 
be simpler to always have getLink() return a fully-qualified path?  Internally 
a FileSystem might support relative paths, but why do we need to expose these?
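
To make the fast/slow suggestion concrete, here is a minimal sketch of Floyd's 
cycle detection applied to a chain of link targets.  It assumes a hypothetical 
step(Path) helper that returns a link's target, or null when the path is not a 
link; the names are illustrative and not taken from the attached patches.
{code}
// Sketch only: step(p) is assumed to return p's link target, or null if p is
// not a symbolic link.
private void checkForLinkCycle(Path start) throws IOException {
  Path slow = start;
  Path fast = start;
  while (true) {
    Path next = step(fast);
    if (next == null) return;          // chain terminates: no cycle
    Path afterNext = step(next);
    if (afterNext == null) return;     // chain terminates: no cycle
    fast = afterNext;                  // fast advances two links per iteration
    slow = step(slow);                 // slow advances one link per iteration
    if (fast.equals(slow)) {
      throw new IOException("Symbolic link cycle detected at " + fast);
    }
  }
}
{code}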

Instead of repeating the link-resolving loop in every method, we might use a 
"closure", e.g.:
{code}
public FSInputStream open(Path p, final int bufferSize) throws IOException {
  // resolve() follows any symbolic links, then invokes next() on the resolved path
  return resolve(p, new FSLinkResolver<FSInputStream>() {
    FSInputStream next(Path p) throws IOException {
      return openImpl(p, bufferSize);
    }
  });
}
{code}
where FSLinkResolver#resolve implements the loop-detection algorithm, calling 
#next to traverse the list.
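
For concreteness, a minimal sketch of what that shared loop might look like, 
written here as a FileSystem-side helper (it could equally live on 
FSLinkResolver).  It assumes the FSLink-style status accessors suggested above 
(isLink(), getLink() returning a fully-qualified Path) and a protected 
getFileStatusImpl(); these bodies are not taken from the attached patches, and 
the simple bounded-traversal check is used rather than the fast/slow pointers.
{code}
// Sketch only: assumes isLink()/getLink() on the returned status object and a
// protected getFileStatusImpl(), per the naming suggestions above.
protected <T> T resolve(Path p, FSLinkResolver<T> resolver) throws IOException {
  final int maxLinks = 32;                   // bound on links traversed
  Path current = p;
  for (int hops = 0; hops <= maxLinks; hops++) {
    FileStatus status = getFileStatusImpl(current);
    if (!status.isLink()) {
      return resolver.next(current);         // fully resolved; do the real work
    }
    current = status.getLink();              // follow one more link
  }
  throw new IOException("Too many symbolic links while resolving " + p);
}
{code}
Each public method then supplies only its operation-specific body in next(), as 
in the open() example above.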


> Create symbolic links in HDFS
> -----------------------------
>
>                 Key: HADOOP-4044
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4044
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: symLink1.patch, symLink1.patch, symLink4.patch, 
> symLink5.patch
>
>
> HDFS should support symbolic links. A symbolic link is a special type of file 
> that contains a reference to another file or directory in the form of an 
> absolute or relative path and that affects pathname resolution. Programs 
> which read or write to files named by a symbolic link will behave as if 
> operating directly on the target file. However, archiving utilities can 
> handle symbolic links specially and manipulate them directly.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
