Re: [1003.1(2016/18)/Issue7+TC2 0001457]: Add readlink(1) utility

Robert Elz via austin-group-l at The Open Group Sat, 26 Feb 2022 10:17:39 -0800

    Date:        Sat, 26 Feb 2022 03:07:15 +0000 (UTC)
    From:        Thorsten Glaser <tg...@mirbsd.org>
    Message-ID:  <pine.bsm.4.64l.2202260301410.24...@herc.mirbsd.org>


  | No, because the trailing /name *may* exist (and be a symlink).

Yes, I had forgotten that case, but that's just a simple modification of
the sample code I provided...

        if test -e "${var}" || test -h "${var}"; then
            newpath=$(realpath "${var}")
(etc).

That is, unless the point of "not existing" is that (with -e) there
is intended to be an error if the final component of the path is a symlink
that doesn't resolve to an existing file, as distinct from the final
component not existing at all.    None of the existing documentation
seems very clear to me, so doing a compatible implementation, without
copying source, seems likely to be difficult.
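
To make the three cases concrete, here is a minimal sketch using only
test(1), which is already standard. The function name classify_path and
the scratch file names are made up for illustration; this says nothing
about what any existing realpath does with -e.

```shell
# classify_path is a hypothetical helper, not part of any utility.
classify_path() {
    if test -e "$1"; then
        echo exists              # resolves (through any symlinks) to a file
    elif test -h "$1"; then
        echo dangling-symlink    # the link exists, its target does not
    else
        echo missing             # no directory entry at all
    fi
}

demo_dir=$(mktemp -d)
: >"$demo_dir/file"
ln -s "$demo_dir/file" "$demo_dir/good-link"
ln -s "$demo_dir/no-such-target" "$demo_dir/dangling"

for name in file good-link dangling absent; do
    printf '%s: %s\n' "$name" "$(classify_path "$demo_dir/$name")"
done
```

The order of the two tests matters: test -e follows symlinks, so a
dangling link fails it and is only caught by the later test -h.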

  | >But I currently don't understand the point of canonicalising a
  | >path that doesn't exist (why not create it first, then get the
  | >canonical form?
  |
  | Output filenames.
  |
  | outf=$(realpath "$1")
  | dosomething >"$outf"

And
        dosomething >"$1"

won't work (in this precise scenario) for some reason?   What's that?

  | >If it isn't to be created, and doesn't exist,
  | >then why does anyone care what its canonical name might be?)
  |
  | set -- ./filename
  | outf=$(realpath "$1")
  | cd "$(realpath "$0/..")"    # or mktemp -d or something…
  | dosomething >"$outf"

OK, dealing with pathnames relative to someplace other than where
you're going to be, that makes sense, but

        case "$1" in
        '')     # whatever you want to do in that case ;;
        /*)     outf=$1;;
        *)      outf=$PWD/$1;;
        esac

generally works for me, and doesn't rely upon what is currently a
non-standard utility.   [Aside: I know the comment in the above is bad syntax]
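
With the comment syntax fixed, the same idea can be wrapped up as a
function. This is only a sketch: the name absolutize is invented here,
and it does nothing beyond prefixing $PWD (no symlink resolution, no
removal of "." or ".." components).

```shell
# absolutize is a hypothetical name; it only makes a path absolute.
absolutize() {
    case "$1" in
    '')     return 1 ;;                  # empty operand: report failure
    /*)     printf '%s\n' "$1" ;;        # already absolute, pass through
    *)      printf '%s\n' "$PWD/$1" ;;   # relative: anchor at current dir
    esac
}

set -- ./filename
outf=$(absolutize "$1")
printf '%s\n' "$outf"
```

Because outf is computed before any later cd, it keeps pointing at the
original location.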

For the cd, just
        cd $0/..
(which seems unlikely to be what you'd want, since $0 isn't usually a
directory - that is unless you believe in "logical" paths, and have the
"/.." suffix simply perform string manipulation, which is something I
detest.   If I want string manipulation, I use sed, that's what it is for.)

coreutils has -L (or -s) that would make this work; without it, it should
fail, I believe, though once again the documentation isn't all that clear.

Since realpath(1) is not in POSIX (yet) we cannot consult that, but we
do have realpath(3) there, and one of its possible errors is:

[ENOTDIR]  A component of the path prefix names an existing file that is
           neither a directory nor a symbolic link to a directory, or the
           file_name argument contains at least one non-<slash> character
           and ends with one or more trailing <slash> characters and the
           last pathname component names an existing file that is neither
           a directory nor a symbolic link to a directory.

For present purposes we can ignore everything after the 2nd line there
(that alternate reason for ENOTDIR doesn't apply here) but assuming that
$0 might often be "/bin/sh" then $0/.. is "/bin/sh/.." in which "A component
of the path prefix" (ie: /bin/sh in this case) "names an existing file" (one
hopes that /bin/sh is an existing file) "that is neither a directory nor
a symbolic link to a directory" (and one assumes that /bin/sh isn't any
kind of directory).   Hence you should simply get ENOTDIR from $0/..
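
The same ENOTDIR behaviour can be seen from the shell with a physical
cd, which hands the path to the system more or less untouched. A small
sketch, using a scratch regular file to stand in for /bin/sh (the
variable names are illustrative only):

```shell
plain_file=$(mktemp)      # a regular file, standing in for /bin/sh

# cd -P does no logical ".." processing, so the system sees FILE/..
# and refuses it because FILE is not a directory.
if (cd -P "$plain_file/.." 2>/dev/null); then
    dotdot_result=entered
else
    dotdot_result=refused
fi
printf 'cd -P FILE/..: %s\n' "$dotdot_result"
rm -f "$plain_file"
```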

A relevant question here is, given that we have

        mkdir /tmp/foo
        ln -s /tmp/foo sym
        >/tmp/bar
        >bar

then does
        realpath sym/../bar
return ./bar (or $PWD/bar more likely), or does it return /tmp/bar ?
I believe /tmp/bar is correct (and that is indeed what mksh seems to
generate.)

But on netbsd ..

netbsd# realpath /bin/sh/..
realpath: /bin/sh/..: Not a directory

whereas for mksh

mksh $ realpath /bin/sh/..
/bin

which seems incorrect to me.   But as realpath(1) isn't yet
standardised, who knows - but it is hard to see that happening in
a way that makes it incompatible with realpath(3).
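
The sym/../bar question is the usual physical versus logical split, and
cd -P / cd -L (both standard) show the two answers side by side. A
sketch in a scratch directory; the directory names are arbitrary and
not taken from any implementation:

```shell
# Use the physical name of the scratch directory so comparisons are exact.
base=$(cd -P "$(mktemp -d)" && pwd -P)
mkdir -p "$base/target/foo" "$base/work"
ln -s "$base/target/foo" "$base/work/sym"

# Physical: follow sym first, then "..", ending in the target's parent.
physical=$(cd "$base/work" && cd -P ./sym/.. >/dev/null && pwd -P)
# Logical: "sym/.." is cancelled textually, leaving the starting directory.
logical=$(cd "$base/work" && cd -L ./sym/.. >/dev/null && pwd -L)

printf 'physical: %s\nlogical:  %s\n' "$physical" "$logical"
```

The physical result corresponds to the /tmp/bar answer above, the
logical one to $PWD/bar.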

  | >That is, the plan would be to perhaps add the -e option
  |  (and maybe -q, and perhaps a newly invented -E),
  |
  | Why?

Because -q is the one common option currently, and -e to coreutils
realpath gives what BSD realpath currently does.   The -E proposal
is to allow BSD realpath to act like the coreutils version, without
sacrificing backwards compat with what exists (but that might not be
needed, I am not sure if we care that much about this ... just do not
know yet.)

  | Instead of -e, you can just do a subsequent test -e on the file.

Perhaps, but with -e the internal implementation is much simpler; it
seems odd to do all the work of coping with a non-existent
file when the code that follows is just going to fail in that case
anyway.   Simpler for everyone to just let realpath(1) fail.

  | Instead of -q, you can 2>/dev/null.

No, that suppresses everything to stderr, including usage messages, etc.
The -q option just suppresses sys call errors (like files not existing).

  | >and almost certainly to allow multiple file args
  |
  | No, absolutely NOT that. That's going to reopen the debate about
  | output separators (and only safe if NUL). And none of the existing
  | implementations has -0/-z/-Z.

coreutils has -z (according to its man page anyway).   And who cares
about the debate?  Like everything else which can deal with multiple
files and perhaps produce errors, if you're in a situation where it
matters, you only use 1 arg - and here you know that 1 arg means at
most 1 path returned (and that any error returned belongs to that sole arg).

That it is often sane to use a utility with just one arg, doesn't mean
that it must be restricted to only work that way.
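
For anyone who wants to see why the separator matters once there is
more than one operand, a short sketch: two names, one of which contains
a newline, counted once with newline terminators and once with NUL
terminators. Plain printf stands in for the utility's output here;
nothing below calls realpath itself.

```shell
nl='
'
tricky="b${nl}c"      # one pathname that happens to contain a newline

# Newline-terminated output: two names look like three records.
newline_records=$(( $(printf '%s\n' a "$tricky" | wc -l) ))
# NUL-terminated output: count the terminators, one per name.
nul_records=$(( $(printf '%s\000' a "$tricky" | tr -cd '\000' | wc -c) ))

printf 'newline: %d records, NUL: %d records\n' \
    "$newline_records" "$nul_records"
```

With a single operand the ambiguity goes away, since everything up to
the final terminator belongs to the one result.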

  | I *did* think about this when adding realpath to mksh, and I decided
  | for handling exactly one argument precisely to avoid this hellhole.

Other versions of realpath handle multiple args, and you might
remember this from earlier in this discussion:

e...@google.com said:
  | one thing i haven't seen mentioned so far (but which i added to toybox
  | myself, so i know it's definitely in use) is that existing realpath
  | implementations support *multiple* file arguments on the command line, not
  | just one.

This one, I suspect, is beyond discussion: readlink and realpath both
accept multiple file args.

kre
