(many months pass...)
I've recently been doing something a bit related:
While I've been refitting my old Linux 9P server, I decided to alter it to
use openat(2), and the other fd-relative *at calls. It used to keep an absolute
pathname string with each FID, so would start a file system walk from the root
directory on every access. If directories were moved or renamed, things broke.
But openat can help with that:
On Linux,
newfd = openat (oldfd, relpath, O_PATH)
effectively does a Twalk. newfd is not an open file, it's a stable reference to
the file itself - much like the current directory in a process. Or, indeed,
like a FID. If it's a directory, you can then open it with
dirfd = openat (fd, ".", O_RDONLY);
but if it's a file you can't! You can stat an fd obtained with O_PATH like this:
fstatat (fd, "", &stat, AT_EMPTY_PATH);
but the equivalent for opening files,
openat (fd, "", O_RDONLY)
doesn't work. There's no O_EMPTYPATH flag for open.
Fortunately:
open ("/proc/self/fd/<fd#>", O_RDONLY)
*does* work. This effectively implements Topen.
Indeed, as hinted on the man page, the /proc/self/fd virtual directory on Linux
seems to be able to do pretty much everything the openat calls do, without
requiring the openat calls - you can use O_PATH with plain open(2). This is
somewhat analogous to the devdup (#d) ideas that were discussed here last April.
There are two things going on here: O_PATH gives you a way to get a file
descriptor for a file without actually opening it, and openat (or
/proc/self/fd) lets you start your walk from any directory rather than just .
or /.
To use this I added a data structure somewhat like the one exportfs uses: each
fid points at the leaf of a ref-counted tree of path elements. As my server
walks the path, it now keeps an O_PATH file descriptor at every step, in
addition to the element name, so it can maintain stable references and still
"get dot dot right". v9fs (the Linux 9P client in the kernel) keeps a trail of
FIDs like this too, so it doesn't need to Twalk "..", but I think the Plan 9
kernel (devmnt?) keeps just one FID for a cwd, so has to Twalk "..".
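A sketch of what such a structure might look like. The struct layout and the
pe_* names are invented here, not taken from the server or from exportfs, and
for brevity each walk allocates a fresh element rather than sharing an
existing child:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* needed for O_PATH */
#endif
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

struct pathelem {
    struct pathelem *parent;    /* NULL at the exported root */
    char *name;                 /* name used to reach this element */
    int pathfd;                 /* O_PATH reference to the element */
    int refs;                   /* fids and children pointing here */
};

static struct pathelem *pe_new(struct pathelem *parent, const char *name, int fd)
{
    struct pathelem *pe = malloc(sizeof *pe);
    char *copy = strdup(name);

    if (pe == NULL || copy == NULL) {
        free(pe);
        free(copy);
        return NULL;
    }
    pe->parent = parent;
    pe->name = copy;
    pe->pathfd = fd;
    pe->refs = 1;
    if (parent != NULL)
        parent->refs++;         /* a child keeps its parent alive */
    return pe;
}

struct pathelem *pe_root(const char *path)
{
    int fd = open(path, O_PATH | O_DIRECTORY | O_CLOEXEC);
    struct pathelem *pe = fd < 0 ? NULL : pe_new(NULL, "/", fd);

    if (pe == NULL && fd >= 0)
        close(fd);
    return pe;
}

/* One Twalk element. ".." is answered from the tree, never by the kernel,
 * so it cannot escape the root or follow a moved directory elsewhere. */
struct pathelem *pe_walk(struct pathelem *from, const char *name)
{
    struct pathelem *pe;
    int fd;

    if (strcmp(name, "..") == 0) {
        pe = from->parent != NULL ? from->parent : from;
        pe->refs++;
        return pe;
    }
    if (strchr(name, '/') != NULL)
        return NULL;            /* 9P walk elements are single names */
    fd = openat(from->pathfd, name, O_PATH | O_CLOEXEC);
    pe = fd < 0 ? NULL : pe_new(from, name, fd);
    if (pe == NULL && fd >= 0)
        close(fd);
    return pe;
}

/* Drop one reference, freeing the element and any ancestors that
 * become unreferenced as a result. */
void pe_put(struct pathelem *pe)
{
    while (pe != NULL && --pe->refs == 0) {
        struct pathelem *parent = pe->parent;

        close(pe->pathfd);
        free(pe->name);
        free(pe);
        pe = parent;
    }
}
```

A fid would hold one reference to its leaf element. Because each element holds
its own O_PATH descriptor, a walk from it still works after an ancestor has
been renamed on the server side.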
Tremove (and Twstat with a new name) are difficult (impossible?) to implement
properly on Linux, because Plan 9 files have a single name, whereas on Linux
they can have several names (hard links). Probably the best we can do is to
remember the name used to get to a file so we have an old name to give to
Linux. Unfortunately renameat(2) cannot take an O_PATH reference as the file to
rename - and indeed in the presence of hard links, such a reference wouldn't
identify which link to change - unless the server remembered how it got there.
I've not been able to think of a way to implement Tremove without either a race
condition, or risking removing the wrong file. The Linux API (or my Linux-fu)
seems to fall just short of making this possible.
For my 9P server, at least, the problem is now confined to the file being
renamed or removed; if an ancestor directory is moved, the reference remains
stable.
So I suppose I almost found a use for openat(2)... :)
Glibc on Linux converts open(2) calls into openat(2) calls, but it's still
possible to call open(2) using the syscall(2) mechanism. Linux has
open_by_handle_at, which is apparently similar to FreeBSD's fhopen. I think
they're there mostly for NFS. I haven't really explored whether they have a
role to play here.
------------------------------------------
9fans: 9fans
Permalink:
https://9fans.topicbox.com/groups/9fans/T675e737e776e5a9c-Mdb5ecf364ba9f0735082d1bc