[
https://issues.apache.org/jira/browse/HDFS-245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Eli Collins updated HDFS-245:
-----------------------------
Attachment: designdocv2.txt
While writing tests I noticed the current API doesn't match POSIX semantics
that closely. If a path refers to a symlink then the symlink is not resolved:
for example, {{getFileStatus}} returns the FileStatus of the link rather than
of what it points to (ie it behaves like {{lstat}} rather than {{stat}}), and
the same goes for {{open}}, {{setReplication}}, etc. While some APIs should act
on the symlink itself (eg {{rename}}, {{delete}}), others need symlinks fully
resolved. The design doc should specify the intended behavior of the
FileContext API wrt symlinks. I attached an updated version and pasted the
relevant section below. What do people think?
h2. FileContext APIs
This section specifies the behavior of the FileContext API when links are
present in paths. The intent is to match POSIX semantics. For most functions,
if symlinks are supported, all links leading up to the target of a path should
automatically be resolved. Some functions will not resolve any links in a given
path. Some functions will, if given a path that refers to a symlink, operate on
the target of the symlink, while others will operate on the symlink itself. For
example, {{setReplication}} and {{getFileBlockLocations}} act on the symlink
target while {{delete}} and {{getLinkFileStatus}} act on the symlink itself.
Behavior is specified both for filesystems that do and do not support symlinks.
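As a rough local-filesystem analogy for operations that act on the symlink itself, here is a minimal sketch using {{java.nio.file}} rather than FileContext. The class {{LinkOpsSketch}} and its helpers are hypothetical names used only for illustration; the point is that renaming or deleting a link leaves its target untouched, which is the behavior proposed for {{rename}} and {{delete}}.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;

// Hypothetical illustration on the local filesystem, not the FileContext API.
class LinkOpsSketch {
    // Like POSIX rename: moves the symlink itself, never its target.
    static Path renameLink(Path link, Path newName) throws IOException {
        return Files.move(link, newName);
    }

    // Like POSIX unlink: removes the symlink itself, never its target.
    static void unlink(Path link) throws IOException {
        Files.delete(link);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("symlink-sketch");
        Path target = Files.createFile(dir.resolve("target"));
        Path link = Files.createSymbolicLink(dir.resolve("link"), target);

        Path renamed = renameLink(link, dir.resolve("renamed"));
        System.out.println(Files.isSymbolicLink(renamed)); // true

        unlink(renamed);
        // The link is gone, the file it pointed to is still there.
        System.out.println(Files.exists(renamed, LinkOption.NOFOLLOW_LINKS)); // false
        System.out.println(Files.exists(target)); // true
    }
}
```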
To support symlink-aware utilities the FileContext API requires some new
interfaces (eg equivalent to {{lstat}}) to indicate whether a path refers to a
symlink.
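To make the {{stat}} versus {{lstat}} distinction concrete, here is a minimal sketch against the local filesystem using {{java.nio.file}}. It is only an analogy for the proposed {{getFileStatus}}, {{getLinkFileStatus}} and {{isLink}}; the class {{LinkStatusSketch}} and its method names are hypothetical and are not part of FileContext.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical illustration on the local filesystem, not the FileContext API.
class LinkStatusSketch {
    // stat-like: follows symlinks, analogous to the proposed getFileStatus.
    static BasicFileAttributes status(Path p) throws IOException {
        return Files.readAttributes(p, BasicFileAttributes.class);
    }

    // lstat-like: does not follow the final symlink, analogous to getLinkFileStatus.
    static BasicFileAttributes linkStatus(Path p) throws IOException {
        return Files.readAttributes(p, BasicFileAttributes.class, LinkOption.NOFOLLOW_LINKS);
    }

    // Analogous to the proposed isLink.
    static boolean isLink(Path p) {
        return Files.isSymbolicLink(p);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("symlink-sketch");
        Path file = Files.write(dir.resolve("file"), new byte[] {1, 2, 3});
        Path link = Files.createSymbolicLink(dir.resolve("link"), file);

        System.out.println(status(link).isRegularFile());      // true
        System.out.println(linkStatus(link).isSymbolicLink()); // true
        System.out.println(isLink(file));                      // false
    }
}
```

For a path that is not a symlink the two status calls report the same kind of file, which mirrors the fallback described for {{getLinkFileStatus}}.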
- {{create}}, {{mkdir}} -- the path should not refer to a symlink since the
path must not currently exist.
- {{delete}}, {{deleteOnExit}} -- if path refers to a symlink then the symlink
is removed (like {{unlink}}).
- {{open}} -- if the given path refers to a symlink then the path is fully
resolved.
- {{set|getWorkingDirectory}} -- if the given path refers to a symlink then the
symlink is fully resolved when setting the working directory, ie if the working
directory is changed to {{/link1/link2}} then subsequent queries of the working
directory should return whatever {{link2}} points to.
- {{setReplication}} -- if the given path refers to a symlink then the path is
fully resolved.
- {{setPermission}} -- if the given path refers to a symlink then the path is
fully resolved (like {{chmod}}). Symlink access is determined by permissions of
the target of the symlink.
- {{setOwner}} -- if the given path refers to a symlink then the path is fully
resolved (like {{chown}}). We could add an {{lchown}} equivalent in the future.
- {{setTimes}} -- if the given path refers to a symlink then the path is fully
resolved. Symlinks do not have access times.
- {{get|setFileChecksum}} -- if the given path refers to a symlink then the
path is fully resolved, ie there are no checksums associated with symlinks.
- {{getFileStatus}} -- if the given path refers to a symlink then the path is
fully resolved, ie returns the FileStatus of the file or directory the symlink
points to.
- *new* {{getLinkFileStatus}} -- like {{lstat}}: if the given path refers to a
symlink then the FileStatus of the symlink is returned, otherwise the result is
the same as if {{getFileStatus}} was called. If symlink support is not enabled
or the underlying filesystem does not support symlinks then the result is the
same as if {{getFileStatus}} was called.
- {{isDirectory}}, {{isFile}} -- if the given path refers to a symlink then the
path is fully resolved, ie if the symlink points to a directory then
{{isDirectory}} returns true.
- *new* {{isLink}} -- returns true if the given path refers to a symlink. If
symlink support is not enabled or the underlying filesystem does not support
symlinks then {{isLink}} returns false.
- {{listStatus}} -- if the given path refers to a symlink then the path is
fully resolved, ie the result is equivalent to calling {{listStatus}} with the
target of the symlink.
- {{getFileBlockLocations}} -- if the given path refers to a symlink then the
path is fully resolved, ie symlinks are not associated with blocks.
- {{getFsStatus}} -- if the given path refers to a symlink then the path is
fully resolved, ie the FsStatus of the target of the symlink is returned.
- {{getLinkTarget}} -- only the first symlink in the given path is resolved. If
symlink support is not enabled or the underlying filesystem does not support
symlinks then an IOException is thrown.
- {{resolve}} -- all symlinks in the given path are resolved. If symlink
support is not enabled or the underlying filesystem does not support symlinks
then no symlinks are resolved.
- {{createSymlink(oldpath, newpath)}}
-- newpath should not refer to a symlink since the path must not currently
exist.
-- _No symlinks are resolved in oldpath_. For example, if {{/link1}} points
to {{/dir}}, then {{createSymlink("/link1/file", "/link1/link2")}} creates
{{link2}} in {{/dir}} and points it to {{/link1/file}} (not {{/dir/file}}).
The path {{/link1/link2}} then resolves as follows: {{/dir/link2}} ->
{{/link1/file}} -> {{/dir/file}}.
-- If symlink support is not enabled or the underlying filesystem does not
support symlinks then an IOException is thrown.
- {{rename(oldpath, newpath)}}
-- If oldpath refers to a symlink, the symlink is renamed (POSIX).
-- If newpath refers to a symlink, the symlink is overwritten (POSIX) if the
OVERWRITE option is passed.
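The {{createSymlink}} example above (oldpath stored unresolved, then resolved link by link at lookup time) can be reproduced on a local filesystem. This is a minimal sketch using {{java.nio.file}}, assuming the platform permits creating symlinks; {{UnresolvedTargetSketch}} and its helpers are hypothetical names, with {{Files.readSymbolicLink}} standing in for reading the stored target and {{Path.toRealPath}} standing in for {{resolve}}.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration on the local filesystem, not the FileContext API.
class UnresolvedTargetSketch {
    // Returns the target exactly as it was stored in the link.
    static Path storedTarget(Path link) throws IOException {
        return Files.readSymbolicLink(link);
    }

    // Resolves every symlink in the path.
    static Path fullyResolved(Path p) throws IOException {
        return p.toRealPath();
    }

    public static void main(String[] args) throws IOException {
        // Use the real path of the temp dir so later comparisons are exact.
        Path root = Files.createTempDirectory("symlink-sketch").toRealPath();
        Path dir = Files.createDirectory(root.resolve("dir"));
        Files.createFile(dir.resolve("file"));
        Path link1 = Files.createSymbolicLink(root.resolve("link1"), dir);

        // oldpath goes through link1 and is stored as given, unresolved.
        Path oldpath = link1.resolve("file");
        Path link2 = Files.createSymbolicLink(link1.resolve("link2"), oldpath);

        System.out.println(storedTarget(link2).equals(oldpath));            // true
        System.out.println(fullyResolved(link2).equals(dir.resolve("file"))); // true
    }
}
```

Storing oldpath verbatim means that if {{link1}} is later pointed somewhere else, {{link2}} follows it, which is the usual POSIX behavior.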
> Create symbolic links in HDFS
> -----------------------------
>
> Key: HDFS-245
> URL: https://issues.apache.org/jira/browse/HDFS-245
> Project: Hadoop HDFS
> Issue Type: New Feature
> Reporter: dhruba borthakur
> Assignee: dhruba borthakur
> Attachments: 4044_20081030spi.java, designdocv1.txt, designdocv2.txt,
> HADOOP-4044-strawman.patch, symlink-0.20.0.patch, symLink1.patch,
> symLink1.patch, symLink11.patch, symLink12.patch, symLink13.patch,
> symLink14.patch, symLink15.txt, symLink15.txt, symlink16-common.patch,
> symlink16-hdfs.patch, symlink16-mr.patch, symlink17-common.txt,
> symlink17-hdfs.txt, symlink18-common.txt, symlink19-common.txt,
> symlink19-common.txt, symlink19-hdfs.txt, symLink4.patch, symLink5.patch,
> symLink6.patch, symLink8.patch, symLink9.patch
>
>
> HDFS should support symbolic links. A symbolic link is a special type of file
> that contains a reference to another file or directory in the form of an
> absolute or relative path and that affects pathname resolution. Programs
> which read or write to files named by a symbolic link will behave as if
> operating directly on the target file. However, archiving utilities can
> handle symbolic links specially and manipulate them directly.