Re: [PATCH 24/32] vfs: syscall: Add fsopen() to prepare for superblock creation [ver #9]

Andy Lutomirski Thu, 12 Jul 2018 09:24:02 -0700

> On Jul 12, 2018, at 7:54 AM, David Howells <[email protected]> wrote:
> 
> Andy Lutomirski <[email protected]> wrote:
> 
>>> On Jul 11, 2018, at 12:22 AM, David Howells <[email protected]> wrote:
>>> 
>>> Andy Lutomirski <[email protected]> wrote:
>>> 
>>>>>  sfd = fsopen("ext4", FSOPEN_CLOEXEC);
>>>>>  write(sfd, "s /dev/sdb1"); // note I'm ignoring write's length arg
>>>> 
>>>> Imagine some malicious program passes sfd as stdout to a setuid
>>>> program. That program gets persuaded to write "s /etc/shadow".  What
>>>> happens?  You’re okay as long as *every single fs* gets it right, but
>>>> that’s asking a lot.
>>> 
>>> Do note that you must already have CAP_SYS_ADMIN to be able to call
>>> fsopen().
>> 
>> If you're not allowing it already, someone will want user namespace
>> root to be able to use this very, very soon.
> 
> Yeah, I'm sure.  And I've been thinking on how to deal with it.
> 
> I think we *have* to open the source files/devices with the creds of whoever
> called fsopen() or fspick() - that way you can't upgrade your privs by passing
> your context fd to a suid program.  To enforce this, I think it's simplest for
> fscontext_write() to call override_creds() right after taking the uapi_mutex
> and then call revert_creds() right before dropping the mutex.
> 

If you make a syscall that attaches a block device to an fscontext, you don’t 
need any of this.  Heck, someone might actually *want* to grab a block device 
from a different namespace.

All this override_creds() stuff is maybe okay if we were fixing an old broken 
thing. But this is brand new.  And having write() call override_creds() and do 
nontrivial things is a fascinating attack surface.

Just imagine what blows up if I abuse fscontext to open a block device on a 
path that traverses an AFS mount or /proc/.../fd or similar.  Or if I splice() 
from a network filesystem into fscontext.

(Al- can’t we just stop allowing splice() at all on things that don’t use 
iov_iter?)

> Another thing we might want to look at is to allow a supervisory process to
> examine the context before permitting the create/reconfigure action to
> proceed.  It might also be possible to do this through the LSM.

Cc Tycho. He’s working on this exact idea using seccomp. And he’d probably 
much, much prefer if configuration of an fscontext didn’t use a 
performance-critical syscall like write().

As a straw man, I suggest:

fsconfigure(contextfd, ADD_BLOCKDEV, dfd, path, flags);

fsconfigure(contextfd, ADD_OPTION, 0, “foo=bar”, flags);

Etc.
Re: [PATCH 24/32] vfs: syscall: Add fsopen() to prepare for superblock creation [ver #9]

Reply via email to