A couple days ago it came to my attention that the hack used for
cloning devices (returning a magic error code and stuffing a file
descriptor number in a magic field of struct lwp) is also used by
/dev/stderr, and more importantly, every caller of vn_open in the
kernel (there are many) needs to know about this hack to avoid
returning internal negative errnos to users. Needless to say, none of
them (besides do_open) actually do, plus this is gross.
The following patch does not clean out the hack, but establishes a
containment perimeter inside vn_open; vn_open now returns either a
vnode or a file descriptor number. (And, along with the file
descriptor number, a flag saying whether it was the EDUPFD or EMOVEFD
case.) Callers of vn_open not prepared to deal with file descriptors,
which is all of them besides do_open, can pass NULL for the file
descriptor return parameter, in which case vn_open will instead return
EOPNOTSUPP.
It also takes advantage of having rearranged vn_open's signature to
stop exposing struct nameidata, which is a plus.
There are now eight arguments to vn_open, which is a lot, but it's
still simpler than struct nameidata:
at_dvp (directory vnode for openat() starting point, NULL ok)
pb (pathbuf)
nmode (additional namei flags)
fmode (open flags, converted from O_* to F*)
cmode (creation permissions)
ret_vp (pointer for returned vnode)
ret_domove (pointer for returned is-EMOVEFD flag)
ret_fd (pointer for returned file descriptor)
I have also rearranged the interface to fd_dupopen, which uses the
vn_open results, to match, and so that its return value argument is
last like it ought to be. This was chosen to be incompatible (as
opposed to just moving int arguments around, which isn't safe) in case
there were other calls to it somewhere. (There weren't.)
This patch is missing the doc change to vnsubr(9) but I'll deal with
that before committing.
It also builds but hasn't been tested yet; test results are pending.
Requires a kernel bump because a number of the callers of vn_open are
in modules.
Comments? Objections? Tentative plan is to commit this along with the
VOP_PARSEPATH changes (proposed years back), which also require a
kernel bump, plus some other pending kernel bump stuff other folks
have, in the next few days.
Note that cleaning out the hack entirely would be expensive (because a
similar alternative return value scheme is needed in the open methods
of struct cdevsw and bdevsw) so I'm not going to attempt it yet; plus
it's a natural part of the long-term device plumbing rearrangement I
posted about a few weeks back, and it makes more sense to do it as
part of that than separately.
Oh also, note that a couple of callers of vn_open were passing
LOCKPARENT and relying on vn_open to ignore it. That's been
rectified.
And before I forget: currently uses of vn_open that yield EMOVEFD when
the caller isn't prepared to handle it leak the fd produced. This
patch doesn't change that; doing so without creating new layer
violations would be a major nuisance. Avoiding this behavior sensibly
requires a complete cleanup of the hack, which prompts layering
violations because it's itself a layer violation.
* Does anyone know why dev/kloader.c calls namei and then vn_open on
the same path? I remember seeing this before and leaving it alone
because nobody I could find was sure what the deal was, but it's
unlikely to work as-is and increasingly likely to break over time...
diff -r 4c2d0a182ef8 external/cddl/osnet/sys/sys/vnode.h
--- a/external/cddl/osnet/sys/sys/vnode.h Sun Jun 27 18:13:54 2021 -0400
+++ b/external/cddl/osnet/sys/sys/vnode.h Mon Jun 28 11:05:45 2021 -0400
@@ -239,7 +239,6 @@
vnode_t **vpp, enum create crwhy, mode_t umask)
{
struct pathbuf *pb;
- struct nameidata nd;
int error;
ASSERT(seg == UIO_SYSSPACE);
@@ -248,11 +247,9 @@
ASSERT(umask == 0);
pb = pathbuf_create(pnamep);
- NDINIT(, LOOKUP, NOFOLLOW, pb);
- error = vn_open(, filemode, createmode);
+ error = vn_open(NULL, pb, 0, filemode, createmode, vpp, NULL, NULL);
if (error == 0) {
- VOP_UNLOCK(nd.ni_vp, 0);
- *vpp = nd.ni_vp;
+ VOP_UNLOCK(*vpp, 0);
}
pathbuf_destroy(pb);
return (error);
diff -r 4c2d0a182ef8 sys/dev/firmload.c
--- a/sys/dev/firmload.cSun Jun 27 18:13:54 2021 -0400
+++ b/sys/dev/firmload.cMon Jun 28 11:05:45 2021 -0400
@@ -204,7 +204,6 @@
firmware_open(const char *drvname, const char *imgname, firmware_handle_t *fhp)
{
struct pathbuf *pb;
- struct nameidata nd;
struct vattr va;
char *pnbuf, *path, *prefix;
firmware_handle_t fh;
@@ -236,8 +235,7 @@
error = ENOMEM;
break;
}
- NDINIT(, LOOKUP, FOLLOW | NOCHROOT, pb);
- error = vn_open(,