On Sun, Nov 9, 2025 at 1:38 AM Al Viro <[email protected]> wrote:
>
> There are two filename-related problems in io_uring and its
> interplay with audit.
>
> Filenames are imported when request is submitted and used when
> it is processed.  Unfortunately, the latter may very well
> happen in a different thread.  In that case the reference to
> filename is put into the wrong audit_context - that of submitting
> thread, not the processing one.  Audit logics is called by
> the latter, and it really wants to be able to find the names
> in audit_context current (== processing) thread.
>
> Another related problem is the headache with refcounts -
> normally all references to given struct filename are visible
> only to one thread (the one that uses that struct filename).
> io_uring violates that - an extra reference is stashed in
> audit_context of submitter.  It gets dropped when submitter
> returns to userland, which can happen simultaneously with
> processing thread deciding to drop the reference it got.
>
> We paper over that by making refcount atomic, but that means
> pointless headache for everyone.
>
> Solution: the notion of partially imported filenames.  Namely,
> already copied from userland, but *not* exposed to audit yet.
>
> io_uring can create that in submitter thread, and complete the
> import (obtaining the usual reference to struct filename) in
> processing thread.
>
> Object: struct delayed_filename.
>
> Primitives for working with it:
>
> delayed_getname(&delayed_filename, user_string) - copies the name
> from userland, returning 0 and stashing the address of (still incomplete)
> struct filename in delayed_filename on success and returning -E... on
> error.
>
> delayed_getname_uflags(&delayed_filename, user_string, atflags) - similar,
> in the same relation to delayed_getname() as getname_uflags() is to getname()
>
> complete_getname(&delayed_getname) - completes the import of filename stashed
> in delayed_getname and returns struct filename to caller, emptying 
> delayed_getname.
>
> dismiss_delayed_getname(&delayed_getname) - destructor; drops whatever
> might be stashed in delayed_getname, emptying it.
>
> Signed-off-by: Al Viro <[email protected]>
> ---
>  fs/namei.c           |  45 +++++++++++++++++--
>  include/linux/fs.h   |  10 +++++
>  io_uring/fs.c        | 101 +++++++++++++++++++++++--------------------
>  io_uring/openclose.c |  16 +++----
>  io_uring/statx.c     |  17 +++-----
>  io_uring/xattr.c     |  30 +++++--------
>  6 files changed, 129 insertions(+), 90 deletions(-)

I don't have any patches to share for this yet, but as a FYI, I've
started working on some audit patches to deal with this issue in a
more general way; the io_uring approach of splitting processing
between a prep and work stage causes audit problems beyond just
filenames.  My current thought is to setup an audit_context in the
io_uring prep stage for those ops that need auditing, and then carry
around the context in the io_kiocb for later use, swapping it to
current->audit_context when the work is performed.  Still plenty of
hand wavy stuff to sort out, and it's entirely possible that it
doesn't work out, or is too ugly to see posted, but it seemed relevant
to the patch.

Regardless, I think this approach is reasonable; best case it is a
stopgap until we have something more general, worst case it becomes a
more permanent fix.  I can live with either outcome.

Acked-by: Paul Moore <[email protected]>

-- 
paul-moore.com

Reply via email to