On 2026-03-11, Andy Lutomirski <[email protected]> wrote:
> On Tue, Mar 10, 2026 at 9:49 PM Aleksa Sarai <[email protected]> wrote:
> >
> > On 2026-03-09, Christian Brauner <[email protected]> wrote:
> > > > > On Sat, 2026-03-07 at 10:56 -0800, Andy Lutomirski wrote:
> > > > > > I think this needs more clarification as to what "regular" means,
> > > > > > since S_IFREG may not be sufficient. The UAPI group page says:
> > > > > >
> > > > > > Use-Case: this would be very useful to write secure programs that
> > > > > > want
> > > > > > to avoid being tricked into opening device nodes with special
> > > > > > semantics while thinking they operate on regular files. This is
> > > > > > particularly relevant as many device nodes (or even FIFOs) come with
> > > > > > blocking I/O (or even blocking open()!) by default, which is not
> > > > > > expected from regular files backed by “fast” disk I/O. Consider
> > > > > > implementation of a naive web browser which is pointed to
> > > > > > file://dev/zero, not expecting an endless amount of data to read.
> > > > > >
> > > > > > What about procfs? What about sysfs? What about /proc/self/fd/17
> > > > > > where that fd is a memfd? What about files backed by non-"fast"
> > > > > > disk
> > > > > > I/O like something on a flaky USB stick or a network mount or FUSE?
> > > > > >
> > > > > > Are we concerned about blocking open? (open blocks as a matter of
> > > > > > course.) Are we concerned about open having strange side effects?
> > > > > > Are we concerned about write having strange side effects? Are we
> > > > > > concerned about cases where opening the file as root results in
> > > > > > elevated privilege beyond merely gaining the ability to write to
> > > > > > that
> > > > > > specific path on an ordinary filesystem?
> > >
> > > I think this is opening up a barrage of question that I'm not sure are
> > > all that useful. The ability to only open regular file isn't intended to
> > > defend against hung FUSE or NFS servers or other random Linux
> > > special-sauce murder-suicide file descriptor traps. For a lot of those
> > > we have O_PATH which can easily function with the new extension. A lot
> > > of the other special-sauce files (most anonymous inode fds) cannot even
> > > be reopened via e.g., /proc.
> >
> > Indeed, I see OPENAT2_REGULAR as a way of optimising the tedious checks
> > that userspace does using O_PATH+/proc/self/fd/$n re-opening when
> > dealing with regular files.
>
> Can you give a brief decription or a link to what these checks are and
> what problem they solve?

There are a few variations, but in this particular case they are just
doing fstat() and then checking whether the file is a regular file
(i.e., S_IFREG) or not.

A container rootfs can contain arbitrary files (because container images
are just tar archives, usually with no restrictions on inodes -- a fair
few container runtimes assume that the devices cgroup is sufficient to
protect against the container overwriting your rootfs). The S_IFREG
check prevents an administrative process from being tricked into opening
a block device or an endlessly-streaming FIFO -- if you also use
RESOLVE_NO_XDEV you can make sure that you don't land on procfs or sysfs
by accident.

I will say that in a previous version of this patchset I said that I
would prefer this be done with an allow-bitmask of S_IFMT values rather
than a single O_REGULAR toggle -- this would allow for use cases such as
"only open a regular file or directory" (inode_type_can_chattr() from
systemd is a practical example of this kind of usage) or "anything
except for block/char devices", but the fact that S_IFBLK is defined as
S_IFCHR|S_IFDIR makes this a little too ugly. :/

-- 
Aleksa Sarai
https://www.cyphar.com/
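
For concreteness, here is a rough sketch of the userspace pattern
described above (O_PATH, then fstat(), then a re-open through
/proc/self/fd/$n), written in Python for brevity. The names
open_regular() and NotRegularFile are made up for illustration and are
not taken from any particular runtime; the sketch assumes Linux with
procfs mounted and leaves out the RESOLVE_NO_XDEV part, which needs
openat2(). The last function just spells out the S_IFBLK observation.

```python
import os
import stat


class NotRegularFile(Exception):
    """Hypothetical error type: the path is not a regular file."""


def open_regular(path, flags=os.O_RDONLY):
    """Return an fd for path only if it refers to a regular file."""
    # O_PATH yields a handle on the inode without really opening it, so a
    # FIFO cannot block here and a device node's open handler never runs.
    # O_NOFOLLOW makes a trailing symlink show up as S_IFLNK (rejected below).
    pfd = os.open(path, os.O_PATH | os.O_NOFOLLOW | os.O_CLOEXEC)
    try:
        mode = os.fstat(pfd).st_mode
        if not stat.S_ISREG(mode):
            raise NotRegularFile(
                "%s: file type bits are %o" % (path, stat.S_IFMT(mode)))
        # Re-open the inode we just checked through the procfs magic link,
        # rather than walking the path a second time.
        return os.open("/proc/self/fd/%d" % pfd, flags | os.O_CLOEXEC)
    finally:
        os.close(pfd)


def blk_is_chr_or_dir():
    """True if S_IFBLK has exactly the bits of S_IFCHR|S_IFDIR set."""
    # The S_IFMT field is an enumeration packed into a few bits, not a set
    # of independent flags, which is why S_IF* values cannot be OR-ed
    # together directly to form an allow-mask.
    return stat.S_IFBLK == (stat.S_IFCHR | stat.S_IFDIR)
```

An O_REGULAR-style flag would collapse the first function into a single
open call, with the kernel doing the type check before any driver or
FIFO open logic runs.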

