/* Background. */
Container runtimes or other administrative management processes will
often interact with root filesystems while in the host mount namespace,
because the cost of doing a chroot(2) on every operation is too
prohibitive (especially in Go, which cannot safely use vfork). However,
a malicious program can trick the management process into doing
operations on files outside of the root filesystem through careful
crafting of symlinks.

Most programs that need this feature have attempted to make this process
safe, by doing all of the path resolution in userspace (with symlinks
being scoped to the root of the malicious root filesystem).
Unfortunately, this method is prone to foot-guns and usually such
implementations have subtle security bugs.

Thus, what userspace needs is a way to resolve a path as though it were
in a chroot(2) -- with all absolute symlinks being resolved relative to
the dirfd root (and ".." components being stuck under the dirfd root[1])
It is much simpler and more straight-forward to provide this
functionality in-kernel (because it can be done far more cheaply and
correctly).

More classical applications that also have this problem (which have
their own potentially buggy userspace path sanitisation code) include
web servers, archive extraction tools, network file servers, and so on.

[1]: At the moment, ".." and magic-link jumping are disallowed for the
     same reason it is disabled for LOOKUP_BENEATH -- currently it is
     not safe to allow it. Future patches may enable it unconditionally
     once we have resolved the possible races (for "..") and semantics
     (for magic-link jumping).

/* Userspace API. */
LOOKUP_IN_ROOT will be exposed to userspace through openat2(2).

There is a slight change in behaviour regarding pathnames -- if the
pathname is absolute then the dirfd is still used as the root of
resolution of LOOKUP_IN_ROOT is specified (this is to avoid obvious
foot-guns, at the cost of a minor API inconsistency).

Signed-off-by: Aleksa Sarai <cyp...@cyphar.com>
---
 fs/namei.c            | 5 +++++
 include/linux/namei.h | 3 ++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/fs/namei.c b/fs/namei.c
index 54fdbdfbeb94..9d00b138f54c 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2277,6 +2277,11 @@ static const char *path_init(struct nameidata *nd, 
unsigned flags)
 
        nd->m_seq = read_seqbegin(&mount_lock);
 
+       /* LOOKUP_IN_ROOT treats absolute paths as being relative-to-dirfd. */
+       if (flags & LOOKUP_IN_ROOT)
+               while (*s == '/')
+                       s++;
+
        /* Figure out the starting path and root (if needed). */
        if (*s == '/') {
                error = nd_jump_root(nd);
diff --git a/include/linux/namei.h b/include/linux/namei.h
index 35a1bf074ff1..c7a010570d05 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -47,8 +47,9 @@ enum {LAST_NORM, LAST_ROOT, LAST_DOT, LAST_DOTDOT, LAST_BIND};
 #define LOOKUP_NO_MAGICLINKS   0x080000 /* No /proc/$pid/fd/ "symlink" 
crossing. */
 #define LOOKUP_NO_SYMLINKS     0x100000 /* No symlink crossing *at all*.
                                            Implies LOOKUP_NO_MAGICLINKS. */
+#define LOOKUP_IN_ROOT         0x200000 /* Treat dirfd as %current->fs->root. 
*/
 /* LOOKUP_* flags which do scope-related checks based on the dirfd. */
-#define LOOKUP_IS_SCOPED LOOKUP_BENEATH
+#define LOOKUP_IS_SCOPED (LOOKUP_BENEATH | LOOKUP_IN_ROOT)
 
 extern int path_pts(struct path *path);
 
-- 
2.23.0

Reply via email to