On Thu, Jun 25, 2026 at 10:50 AM Christian Brauner <[email protected]> wrote:
> The arguments I have heard from various people so far are:
>
> (1) Userspace would be able to clone a random chroot to /woot and run a
>     binary from it without having to set up a complicated sandbox
>     effectively making dynamically linked binaries more like static
>     binaries in a sense.
>
> (2) Quote:
>     "If you debootstrap/dnf a chroot to some location in your
>     home dir and try to run a binary from it, that it tries to load the
>     libraries from your /usr is a pretty unintuitive and not at all
>     useful behavior."
>
> (3) Quote:
>     "[Various remote execution things run in locked down containers that
>     disable userns, which makes the sandbox impossible and hence our
>     builds wouldn't work there."

FWIW I think someone also mentioned to me that it would make things
easier for them if they could build a piece of software in one
environment and then bundle it up with all required libraries and such
and run it in a very different environment, without
container/sandboxing stuff and without static linking. But I guess
that's kinda niche.

> I'm discounting "Oh, userspace already allows this so why not the
> kernel.". I think that's generally a bad argument. Kernel and userspace
> aren't really alike in that regard.
>
> The userspace ORIGIN concept is guarded behind AT_SECURE. The kernel has

(To be pedantic: The userspace $ORIGIN concept is only partially gated
on AT_SECURE - even for AT_SECURE binaries, glibc still accepts
$ORIGIN paths that land in its allowlist of acceptable library
directories, listed in "/lib64/ld-linux-x86-64.so.2 --list-diagnostics
| grep ^path.system_dirs". But clearly we wouldn't want to mirror that
in the kernel.)

> to enforce the same rule. That means the loader now depends on the type
> of binary. I think this is a rather serious issue.

And annoyingly, the bprm->secureexec flag can change in
security_bprm_creds_from_file(), which is currently reached from
begin_new_exec(), which is called after we've already opened the
interpreter, so accessing ->secureexec state during the interpreter
lookup would require some refactoring. So I think this is a doable
change, but would require more work.

Or we could take the easy way out and say "the kernel always rejects
this unless LSM_UNSAFE_NO_NEW_PRIVS is set", which would make it clear
that this can't lead to privilege escalation and would serve as an
incentive for people to stop doing stuff that relies on setuid
binaries or privileged apparmor/selinux transitions. :P

> First, it creates confusion in userspace what loader is used. Second, it
> means anything that any build/chroot that uses AT_SECURE binaries now
> has to use the sandboxing solution anyway or risk that some binaries use
> the system loader and others the chroot loader.

I think we would probably just fail the execve() attempt if we see
$ORIGIN in the interpreter in an AT_SECURE execution, since the
interpreter field does not allow listing multiple alternatives?
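
As a rough illustration of that "just fail it" rule, here is a minimal
userspace sketch. It is not the proposed patch: the helper names
interp_uses_origin() and check_interp_policy() are hypothetical, the
-EACCES return value is an assumption (the thread does not name an
errno), and so is the requirement that "$ORIGIN" be followed by "/" or
the end of the string.

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

#define ORIGIN_TOKEN "$ORIGIN"

/*
 * Hypothetical helper: true if the interpreter path begins with the
 * "$ORIGIN" token. Assumption: the token must be a whole first path
 * component, so "$ORIGINAL/ld.so" does not count.
 */
bool interp_uses_origin(const char *interp)
{
	size_t n = strlen(ORIGIN_TOKEN);

	return strncmp(interp, ORIGIN_TOKEN, n) == 0 &&
	       (interp[n] == '/' || interp[n] == '\0');
}

/*
 * Hypothetical policy check: 0 means "go ahead", a negative errno
 * means "fail the execve()". There is no fallback to the system
 * loader, because the interpreter field names exactly one path.
 */
int check_interp_policy(const char *interp, bool secureexec)
{
	if (!interp_uses_origin(interp))
		return 0;	/* ordinary path, nothing changes */

	/* Assumed errno; the real choice would be up to the patch. */
	return secureexec ? -EACCES : 0;
}
```

With this shape, an AT_SECURE binary that asks for "$ORIGIN/ld.so"
fails outright instead of silently picking a different loader.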

> Ignoring AT_SECURE, LSMs likely will need a say in whether that ORIGIN
> thing gets honored or not introducing yet another vector where this can
> be overridden or ignored.
>
> Also, we change long-standing kernel behavior which will be very
> surprising for any userspace that might implicitly rely on the fact that
> the system loader is used. So even if we were to do something like this
> it would very likely have to be configurable in some way.

I think the proposed patch will only change behavior if the
interpreter path starts with "$ORIGIN"? That wouldn't work on existing
kernels unless you have a directory literally named "$ORIGIN" in the
cwd, because "$ORIGIN/..." would be interpreted as a normal relative
path.
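
To make the "normal relative path" point concrete, here is a small
userspace sketch. It uses plain open() as a stand-in for the kernel's
interpreter lookup, on the assumption that both do an ordinary path
walk relative to the cwd. The function name
origin_resolves_literally() and the /tmp scratch directory are made up
for the demo.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <limits.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Returns 1 if "$ORIGIN/ld.so" resolves as a plain relative path
 * through a directory literally named "$ORIGIN", 0 if it does not,
 * and -1 if the scratch setup fails.
 */
int origin_resolves_literally(void)
{
	char tmpl[] = "/tmp/origin-demo-XXXXXX";
	char oldcwd[PATH_MAX];
	int fd, ret = -1;

	if (!getcwd(oldcwd, sizeof(oldcwd)))
		return -1;
	if (!mkdtemp(tmpl))
		return -1;
	if (chdir(tmpl) != 0)
		goto out_rmdir;

	/* A directory whose name is the seven characters "$ORIGIN". */
	if (mkdir("$ORIGIN", 0700) != 0)
		goto out_chdir;

	fd = open("$ORIGIN/ld.so", O_CREAT | O_WRONLY | O_EXCL, 0600);
	if (fd < 0)
		goto out_rmorigin;
	close(fd);

	/* No expansion happens: this is just a relative lookup. */
	fd = open("$ORIGIN/ld.so", O_RDONLY);
	ret = (fd >= 0);
	if (fd >= 0)
		close(fd);

	unlink("$ORIGIN/ld.so");
out_rmorigin:
	rmdir("$ORIGIN");
out_chdir:
	if (chdir(oldcwd) != 0)
		ret = -1;
out_rmdir:
	rmdir(tmpl);
	return ret;
}
```

So the only existing setups whose behavior could change are ones that
already have such a directory in the cwd at exec time.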

> This makes this all ripe for malicious loader injection attacks. And we
> need to consider this possibility.
>
> So I'm not enthusiastic about this. I want this to be consistent.
