[
https://issues.apache.org/jira/browse/HADOOP-4044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633842#action_12633842
]
Doug Cutting commented on HADOOP-4044:
--------------------------------------
Some comments:
- I don't like the 'vfs' package and naming. Symbolic links should not be a
distinguished portion of the FileSystem API, but seamlessly integrated. So I
suggest that vfs/VfsStatus be renamed to FSLinkable, vfs/VfsStatusBase to
FSLink, vfs/VfsStatusBoolean to FSLinkBoolean, vfs/VfsStatusFileStatus to
LinkableFileStatus, etc. If these are to be in a separate package, it might be
called 'fs/spi', since they are primarily needed only by implementors of the
FileSystem API, not by users. The protected implementation methods should be
called openImpl(), appendImpl(), etc.
- getLink() should return a Path, not a String.
- getLink() should throw an exception when isLink() is false.
- The check for link cycles is wrong. If the cycle begins after the first link
traversed, it will not be detected. A common approach is simply to limit the
number of links traversed to a constant. Alternately, you can keep a 'fast'
and a 'slow' pointer, advancing the fast pointer through the chain twice as
fast as the slow one; if they are ever equal, there is a loop. This detects
all loops. (Both approaches are sketched after the {code} example below.)
- I don't see the need for both getLink() and getRemainingPath(). Wouldn't it
be simpler to always have getLink() return a fully-qualified path? Internally
a FileSystem might support relative paths, but why do we need to expose these?
Instead of repeating the link-resolving loop in every method, we might use a
"closure", e.g.:
{code}
public FSInputStream open(Path p, final int bufferSize) throws IOException {
  return resolve(p, new FSLinkResolver<FSInputStream>() {
    public FSInputStream next(Path p) throws IOException {
      return openImpl(p, bufferSize);
    }
  });
}
{code}
where FSLinkResolver#resolve implements the loop-detection algorithm, calling
#next to traverse the list.
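For illustration only, here is a minimal sketch of one plausible shape for that
resolve() helper, using the constant-limit approach. FSLinkResolver,
UnresolvedLinkException, getLinkTarget() and MAX_LINK_DEPTH are assumptions
made for the sketch, not names taken from the attached patches:
{code}
// Sketch only: the callback interface plus a FileSystem-side resolve() that
// retries the operation, following one link at a time, up to a constant bound.
public interface FSLinkResolver<T> {
  T next(Path p) throws IOException;              // attempt the operation at p
}

private static final int MAX_LINK_DEPTH = 32;     // constant bound on links traversed

protected <T> T resolve(Path p, FSLinkResolver<T> resolver) throws IOException {
  for (int depth = 0; depth < MAX_LINK_DEPTH; depth++) {
    try {
      return resolver.next(p);                    // succeeds once p is not a link
    } catch (UnresolvedLinkException e) {         // assumed to be thrown when p is a link
      p = getLinkTarget(p);                       // follow one link and retry
    }
  }
  throw new IOException("Possible cyclic symbolic link starting at " + p);
}
{code}
And a sketch of the fast/slow alternative, again with a hypothetical helper:
nextLink(p) is assumed to return the link's fully-qualified target, or null
when p is not a link:
{code}
// Floyd-style cycle check over a chain of symbolic links.  The fast pointer
// advances two links per iteration and the slow pointer one; if they ever
// point at the same path, the chain contains a cycle.
boolean hasLinkCycle(Path start) throws IOException {
  Path slow = start;
  Path fast = nextLink(start);
  while (fast != null) {
    if (fast.equals(slow)) {
      return true;                                // pointers met: cycle detected
    }
    Path step = nextLink(fast);
    if (step == null) {
      return false;                               // chain ends at a non-link
    }
    fast = nextLink(step);                        // advance two links
    slow = nextLink(slow);                        // advance one link
  }
  return false;                                   // chain ends at a non-link
}
{code}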
> Create symbolic links in HDFS
> -----------------------------
>
> Key: HADOOP-4044
> URL: https://issues.apache.org/jira/browse/HADOOP-4044
> Project: Hadoop Core
> Issue Type: New Feature
> Components: dfs
> Reporter: dhruba borthakur
> Assignee: dhruba borthakur
> Attachments: symLink1.patch, symLink1.patch, symLink4.patch,
> symLink5.patch
>
>
> HDFS should support symbolic links. A symbolic link is a special type of file
> that contains a reference to another file or directory in the form of an
> absolute or relative path and that affects pathname resolution. Programs
> which read or write to files named by a symbolic link will behave as if
> operating directly on the target file. However, archiving utilities can
> handle symbolic links specially and manipulate them directly.