On 2018-10-27, Al Viro <v...@zeniv.linux.org.uk> wrote:
> On Sat, Oct 27, 2018 at 06:17:29PM +1100, Aleksa Sarai wrote:
>
> > I'm going to send out a v4 "soon" but I would like to know what folks
> > think about having resolveat(2) (or similar) to separate the scoping O_*
> > flags and produce an O_PATH -- since unsupported O_* flags are ignored
> > by older kernels userspace will have to do plenty of checking after
> > each path operation.
> >
> > Personally, I believe this (along with AT_EMPTY_PATH for openat(2))
> > would help with some other O_PATH issues.
>
> The trouble with resolveat(2) is that for anything directory-modifying
> you really want directory locked before the lookup for last component.
> IOW, funlink(2) et al. are hopeless.

Ah, right... In those cases I think that AT_SYMLINK_NOFOLLOW could help,
or maybe we would need to have some of the scoping flags for other
syscalls (though this would be an issue in either case for scoping
unlinkat(2) -- even if we used O_BENEATH). :/

But my main issue at the moment with O_PATH is that /proc/self/fd/...
re-opening allows for some very odd delayed-access-check attacks.
openat(2) doesn't give you an O_EMPTYPATH but that is what you get with
/proc.

For instance, take /proc/self/exe. Tautologically, it is impossible to
open it O_RDWR -- if you are resolving it through an "exe" magic symlink
then it is being used as a process's ->mm (and thus is locked from
writing). *However* you can open it O_PATH and then later re-open it
through /proc/self/fd. We had cases where a container runtime joining a
container would be able to do this and overwrite the container binary
*on the host*. This has been mitigated now (as part of CVE-2016-9962),
but I would be very shocked if there were no other places where this
sort of thing could happen.

My proposal for resolveat(2) would let us have some sort of "I want
these access bits" API for O_PATH. Of course there are some rather
unpleasant changes I think you'd need to allow for this usecase -- my
back-of-the-envelope proposal would be to stash away the fmode bits
inside 'struct path' so that do_last() can see whether we are doing a
re-open of an existing 'struct file' (but I'm sure this sounds awful).

Is this a problem you think deserves solving / is there a better way of
going about it?

Thanks.

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>
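
For illustration, the O_PATH re-open through /proc/self/fd described
above can be sketched roughly as follows. This is only a minimal
userspace demonstration on a scratch file, assuming Linux with /proc
mounted and Python 3; the helper names reopen_via_proc() and demo() are
hypothetical and not part of any proposed API. It shows that the O_PATH
descriptor itself cannot be written to, while re-opening its magic link
yields a read-write descriptor whose access check happens only at
re-open time.

```python
import errno
import os
import tempfile


def reopen_via_proc(path_fd: int, flags: int) -> int:
    """Hypothetical helper: re-open an O_PATH fd through /proc/self/fd."""
    # The access check happens here, at re-open time, not when the
    # O_PATH descriptor was originally obtained.
    return os.open(f"/proc/self/fd/{path_fd}", flags)


def demo() -> bytes:
    # Scratch file standing in for the file an attacker wants to modify.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        target = tmp.name

    path_fd = os.open(target, os.O_PATH)
    try:
        # An O_PATH descriptor carries no I/O rights: write(2) fails
        # with EBADF.
        try:
            os.write(path_fd, b"x")
        except OSError as exc:
            assert exc.errno == errno.EBADF
        else:
            raise RuntimeError("O_PATH fd was unexpectedly writable")

        # Re-opening the magic link upgrades the handle to read-write.
        rw_fd = reopen_via_proc(path_fd, os.O_RDWR)
        try:
            os.write(rw_fd, b"overwritten")
            os.lseek(rw_fd, 0, os.SEEK_SET)
            return os.read(rw_fd, 64)
        finally:
            os.close(rw_fd)
    finally:
        os.close(path_fd)
        os.unlink(target)


if __name__ == "__main__":
    print(demo())
```

Nothing in the O_PATH descriptor records which access modes the original
opener was entitled to, which is the gap the "I want these access bits"
idea is meant to close.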