Re: [PATCH v9 05/10] namei: O_BENEATH-style path resolution flags

2019-07-17 Thread Aleksa Sarai
On 2019-07-14, Al Viro  wrote:
> On Sun, Jul 14, 2019 at 05:00:29PM +1000, Aleksa Sarai wrote:
> > The basic property being guaranteed by LOOKUP_IN_ROOT is that it will
> > not result in resolution of a path component which was not inside the
> > root of the dirfd tree at some point during resolution (and that all
> > absolute symlink and ".." resolution will be done relative to the
> > dirfd). This may smell slightly of chroot(2), because unfortunately it
> > is a similar concept -- the reason for this is to allow for a more
> > efficient way to safely resolve paths inside a rootfs than spawning a
> > separate process to then pass back the fd to the caller.
> 
> IDGI...  If attacker can modify your subtree, you have already lost -
> after all, they can make anything appear inside that tree just before
> your syscall is made and bring it back out immediately afterwards.
> And if they can't, what is the race you are trying to protect against?
> Confused...

I'll be honest, this code mostly exists because Jann Horn said that it
was necessary in order for this interface to be safe against those kinds
of attacks. Though, it's also entirely possible I just am
mis-remembering the attack scenario he described when I posted v1 of
this series last year.

The use-case I need this functionality for (as do other container
runtimes) is one where you are trying to safely interact with a
directory tree that is a (malicious) container's root filesystem -- so
the container won't be able to move the directory tree root, nor can
they move things outside the rootfs into it (or the reverse). Users
dealing with FTP, web, or file servers probably have similar
requirements.

There is an obvious race condition if you allow the attacker to move the
root (I give an example and test-case of it in the last patch in the
series), and given that it is fairly trivial to defend against I don't
see the downside in including it? But it's obviously your call -- and
maybe Jann Horn can explain the reasoning behind this much better than I
can.

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH



signature.asc
Description: PGP signature


Re: [PATCH v9 05/10] namei: O_BENEATH-style path resolution flags

2019-07-16 Thread Aleksa Sarai
On 2019-07-14, Al Viro  wrote:
> On Sat, Jul 13, 2019 at 03:41:53AM +0100, Al Viro wrote:
> > On Fri, Jul 12, 2019 at 04:00:26PM +0100, Al Viro wrote:
> > > On Fri, Jul 12, 2019 at 02:25:53PM +0100, Al Viro wrote:
> > > 
> > > > if (flags & LOOKUP_BENEATH) {
> > > > nd->root = nd->path;
> > > > if (!(flags & LOOKUP_RCU))
> > > > path_get(>root);
> > > > else
> > > > nd->root_seq = nd->seq;
> > > 
> > > BTW, this assignment is needed for LOOKUP_RCU case.  Without it
> > > you are pretty much guaranteed that lazy pathwalk will fail,
> > > when it comes to complete_walk().
> > > 
> > > Speaking of which, what would happen if LOOKUP_ROOT/LOOKUP_BENEATH
> > > combination would someday get passed?
> > 
> > I don't understand what's going on with ->r_seq in there - your
> > call of path_is_under() is after having (re-)sampled rename_lock,
> > but if that was the only .. in there, who's going to recheck
> > the value?  For that matter, what's to guarantee that the thing
> > won't get moved just as you are returning from handle_dots()?
> > 
> > IOW, what does LOOKUP_IN_ROOT guarantee for caller (openat2())?
> 
> Sigh...  Usual effects of trying to document things:
> 
> 1) LOOKUP_NO_EVAL looks bogus.  It had been introduced by commit 57d4657716ac
> (audit: ignore fcaps on umount) and AFAICS it's crap.  It is set in
> ksys_umount() and nowhere else.  It's ignored by everything except
> filename_mountpoint().  The thing is, call graph for filename_mountpoint()
> is
>   filename_mountpoint()
>   <- user_path_mountpoint_at()
>   <- ksys_umount()
>   <- kern_path_mountpoint()
>   <- autofs_dev_ioctl_ismountpoint()
>   <- find_autofs_mount()
>   <- autofs_dev_ioctl_open_mountpoint()
>   <- autofs_dev_ioctl_requester()
>   <- autofs_dev_ioctl_ismountpoint()
> In other words, that flag is basically "was filename_mountpoint()
> been called by umount(2) or has it come from an autofs ioctl?".
> And looking at the rationale in that commit, autofs ioctls need
> it just as much as umount(2) does.  Why is it not set for those
> as well?  And why is it conditional at all?

In addition, LOOKUP_NO_EVAL == LOOKUP_OPEN (0x100). Is that meant to be
the case? Also I just saw you have a patch in work.namei that fixes this
up -- do you want me to rebase on top of that?

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH



signature.asc
Description: PGP signature


Re: [PATCH v9 05/10] namei: O_BENEATH-style path resolution flags

2019-07-14 Thread Al Viro
On Sun, Jul 14, 2019 at 05:00:29PM +1000, Aleksa Sarai wrote:

> The basic property being guaranteed by LOOKUP_IN_ROOT is that it will
> not result in resolution of a path component which was not inside the
> root of the dirfd tree at some point during resolution (and that all
> absolute symlink and ".." resolution will be done relative to the
> dirfd). This may smell slightly of chroot(2), because unfortunately it
> is a similar concept -- the reason for this is to allow for a more
> efficient way to safely resolve paths inside a rootfs than spawning a
> separate process to then pass back the fd to the caller.

IDGI...  If attacker can modify your subtree, you have already lost -
after all, they can make anything appear inside that tree just before
your syscall is made and bring it back out immediately afterwards.
And if they can't, what is the race you are trying to protect against?
Confused...


Re: [PATCH v9 05/10] namei: O_BENEATH-style path resolution flags

2019-07-14 Thread Aleksa Sarai
On 2019-07-12, Al Viro  wrote:
> On Fri, Jul 12, 2019 at 01:55:52PM +0100, Al Viro wrote:
> > On Fri, Jul 12, 2019 at 01:39:24PM +0100, Al Viro wrote:
> > > On Fri, Jul 12, 2019 at 08:57:45PM +1000, Aleksa Sarai wrote:
> > > 
> > > > > > @@ -2350,9 +2400,11 @@ static const char *path_init(struct 
> > > > > > nameidata *nd, unsigned flags)
> > > > > > s = ERR_PTR(error);
> > > > > > return s;
> > > > > > }
> > > > > > -   error = dirfd_path_init(nd);
> > > > > > -   if (unlikely(error))
> > > > > > -   return ERR_PTR(error);
> > > > > > +   if (likely(!nd->path.mnt)) {
> > > > > 
> > > > > Is that a weird way of saying "if we hadn't already called 
> > > > > dirfd_path_init()"?
> > > > 
> > > > Yes. I did it to be more consistent with the other "have we got the
> > > > root" checks elsewhere. Is there another way you'd prefer I do it?
> > > 
> > > "Have we got the root" checks are inevitable evil; here you are making the
> > > control flow in a single function hard to follow.
> > > 
> > > I *think* what you are doing is
> > >   absolute pathname, no LOOKUP_BENEATH:
> > >   set_root
> > >   error = nd_jump_root(nd)
> > >   else
> > >   error = dirfd_path_init(nd)
> > >   return unlikely(error) ? ERR_PTR(error) : s;
> > > which should be a lot easier to follow (not to mention shorter), but I 
> > > might
> > > be missing something in all of that.
> > 
> > PS: if that's what's going on, I would be tempted to turn the entire
> > path_init() part into this:
> > if (flags & LOOKUP_BENEATH)
> > while (*s == '/')
> > s++;
> > in the very beginning (plus the handling of nd_jump_root() prototype
> > change, but that belongs with nd_jump_root() change itself, obviously).
> > Again, I might be missing something here...
> 
> Argh... I am, at that - you have setting path->root (and grabbing it)
> in LOOKUP_BENEATH cases and you do it after dirfd_path_init().  So
> how about
>   if (flags & LOOKUP_BENEATH)
>   while (*s == '/')
>   s++;

I can do this for LOOKUP_IN_ROOT, but currently the semantics for
LOOKUP_BENEATH is that absolute paths will return -EXDEV
indiscriminately (nd_jump_root() errors out with LOOKUP_BENEATH). To be
honest, the check could actually just be:

  if (flags & LOOKUP_BENEATH)
if (*s == '/')
  return ERR_PTR(-EXDEV);

(Though we'd still need -EXDEV in nd_jump_root() for obvious reasons.)

The logic being that an absolute path means that the resolution starts
out without being "beneath" the starting point -- thus violating the
contract of LOOKUP_BENEATH. And since the "handle absolute paths like
they're scoped to the root" is only implemented for LOOKUP_IN_ROOT, I'd
think it's a bit odd to have LOOKUP_BENEATH do it too for absolute
paths.

I'll be honest, this patchset is more confusing to both of us because of
LOOKUP_BENEATH -- I've only kept it since it was part of the original
patchset (O_BENEATH). Personally I think more people will be far more
interested in LOOKUP_IN_ROOT. Does anyone mind if I drop the
LOOKUP_BENEATH parts of this series, and only keep LOOKUP_NO_* and
LOOKUP_IN_ROOT?

I make a change as you outlined for LOOKUP_IN_ROOT, though.

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH



signature.asc
Description: PGP signature


Re: [PATCH v9 05/10] namei: O_BENEATH-style path resolution flags

2019-07-14 Thread Aleksa Sarai
On 2019-07-13, Al Viro  wrote:
> On Fri, Jul 12, 2019 at 04:00:26PM +0100, Al Viro wrote:
> > On Fri, Jul 12, 2019 at 02:25:53PM +0100, Al Viro wrote:
> > 
> > >   if (flags & LOOKUP_BENEATH) {
> > >   nd->root = nd->path;
> > >   if (!(flags & LOOKUP_RCU))
> > >   path_get(>root);
> > >   else
> > >   nd->root_seq = nd->seq;
> > 
> > BTW, this assignment is needed for LOOKUP_RCU case.  Without it
> > you are pretty much guaranteed that lazy pathwalk will fail,
> > when it comes to complete_walk().
> > 
> > Speaking of which, what would happen if LOOKUP_ROOT/LOOKUP_BENEATH
> > combination would someday get passed?
> 
> I don't understand what's going on with ->r_seq in there - your
> call of path_is_under() is after having (re-)sampled rename_lock,
> but if that was the only .. in there, who's going to recheck
> the value?  For that matter, what's to guarantee that the thing
> won't get moved just as you are returning from handle_dots()?
> 
> IOW, what does LOOKUP_IN_ROOT guarantee for caller (openat2())?

I tried to explain this in the commit message for "namei: aggressively
check for nd->root escape on ".." resolution", but I probably could've
explained it better.

The basic property being guaranteed by LOOKUP_IN_ROOT is that it will
not result in resolution of a path component which was not inside the
root of the dirfd tree at some point during resolution (and that all
absolute symlink and ".." resolution will be done relative to the
dirfd). This may smell slightly of chroot(2), because unfortunately it
is a similar concept -- the reason for this is to allow for a more
efficient way to safely resolve paths inside a rootfs than spawning a
separate process to then pass back the fd to the caller.

We don't want to do a path_is_under() check for every ".." (otherwise
lookups have a quadratic slowdown when doing many ".."s), so I instead
only do a check after a rename or a mount (which are the only operations
which could change what ".." points to). And since we do the
path_is_under() check if m_seq or r_seq need a retry, we can re-take
them[+].

The main reason why I don't re-check path_is_under() after handle_dots()
is that there is no way to be sure that a racing rename didn't happen
after your last path_is_under() check. The rename could happen after the
syscall returns, after all.

So, the main purpose of the check is to make sure that a ".."s after a
rename doesn't result in an escape. If the rename happens after we've
traversed through a ".." that means that the ".." was inside the root in
the first place (a root ".." is handled by follow_dotdot). If the rename
happens after we've gone through handle_dots() and there is no
subsequent ".." then to userspace it looks identical to the rename
occurring after the syscall has returned. If there is a subsequent ".."
after a racing rename then we may have moved into a path that wasn't
path_is_under() and so we have to check it.

The only way I could see you could solve the race completely is if you
had a way for userspace to lock things from being able to be renamed (or
MS_MOVE'd). And that feels like a really bad idea to me.

[+]: You asked why don't I re-take m_seq. The reason is that I don't
 fully understand all the other m_seq checks being done during
 resolution, and we aren't definitely doing them all in
 handle_dots(). So I assumed re-taking it could result in me
 breaking RCU-walk which obviously would be bad. Since I am the only
 thing using nd->r_seq, I can re-take it without issue.

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH



signature.asc
Description: PGP signature


Re: [PATCH v9 05/10] namei: O_BENEATH-style path resolution flags

2019-07-13 Thread Al Viro
On Sat, Jul 13, 2019 at 03:41:53AM +0100, Al Viro wrote:
> On Fri, Jul 12, 2019 at 04:00:26PM +0100, Al Viro wrote:
> > On Fri, Jul 12, 2019 at 02:25:53PM +0100, Al Viro wrote:
> > 
> > >   if (flags & LOOKUP_BENEATH) {
> > >   nd->root = nd->path;
> > >   if (!(flags & LOOKUP_RCU))
> > >   path_get(>root);
> > >   else
> > >   nd->root_seq = nd->seq;
> > 
> > BTW, this assignment is needed for LOOKUP_RCU case.  Without it
> > you are pretty much guaranteed that lazy pathwalk will fail,
> > when it comes to complete_walk().
> > 
> > Speaking of which, what would happen if LOOKUP_ROOT/LOOKUP_BENEATH
> > combination would someday get passed?
> 
> I don't understand what's going on with ->r_seq in there - your
> call of path_is_under() is after having (re-)sampled rename_lock,
> but if that was the only .. in there, who's going to recheck
> the value?  For that matter, what's to guarantee that the thing
> won't get moved just as you are returning from handle_dots()?
> 
> IOW, what does LOOKUP_IN_ROOT guarantee for caller (openat2())?

Sigh...  Usual effects of trying to document things:

1) LOOKUP_NO_EVAL looks bogus.  It had been introduced by commit 57d4657716ac
(audit: ignore fcaps on umount) and AFAICS it's crap.  It is set in
ksys_umount() and nowhere else.  It's ignored by everything except
filename_mountpoint().  The thing is, call graph for filename_mountpoint()
is
filename_mountpoint()
<- user_path_mountpoint_at()
<- ksys_umount()
<- kern_path_mountpoint()
<- autofs_dev_ioctl_ismountpoint()
<- find_autofs_mount()
<- autofs_dev_ioctl_open_mountpoint()
<- autofs_dev_ioctl_requester()
<- autofs_dev_ioctl_ismountpoint()
In other words, that flag is basically "was filename_mountpoint()
been called by umount(2) or has it come from an autofs ioctl?".
And looking at the rationale in that commit, autofs ioctls need
it just as much as umount(2) does.  Why is it not set for those
as well?  And why is it conditional at all?

1b) ... because audit_inode() wants LOOKUP_... as the last argument,
only to remap it into AUDIT_..., that's why.  So audit needs something
guaranteed not to conflict with LOOKUP_PARENT (another flag getting
remapped).  So why do we bother with remapping those, anyway?  Let's look
at the callers:

fs/namei.c:933: audit_inode(nd->name, nd->stack[0].link.dentry, 0);
fs/namei.c:2353:audit_inode(name, path->dentry, flags & 
LOOKUP_PARENT);
fs/namei.c:2394:audit_inode(name, parent->dentry, 
LOOKUP_PARENT);
fs/namei.c:2721:audit_inode(name, path->dentry, flags & 
LOOKUP_NO_EVAL);
fs/namei.c:3302:audit_inode(nd->name, dir, LOOKUP_PARENT);
fs/namei.c:3336:audit_inode(nd->name, file->f_path.dentry, 0);
fs/namei.c:3371:audit_inode(nd->name, path.dentry, 0);
fs/namei.c:3389:audit_inode(nd->name, nd->path.dentry, 0);
fs/namei.c:3490:audit_inode(nd->name, child, 0);
fs/namei.c:3509:audit_inode(nd->name, path.dentry, 0);
ipc/mqueue.c:788:   audit_inode(name, dentry, 0);

In all but two of those we have a nice constant value - 0 or AUDIT_INODE_PARENT.
One of two exceptions is in filename_mountpoint(), and there we want
unconditional AUDIT_INODE_NOEVAL (see above).  What of the other?  It's
if (likely(!retval))
audit_inode(name, path->dentry, flags & LOOKUP_PARENT);
in filename_lookup().  And that is bogus as well.  filename_lookupat() would
better *NOT* get LOOKUP_PARENT in flags.  And it doesn't - not since
commit 8bcb77fabd7c (namei: split off filename_lookupat() with LOOKUP_PARENT)
back in 2015.  In filename_parentat() introduced there we have
audit_inode(name, parent->dentry, LOOKUP_PARENT);
and at the same point the call in filename_lookupat() should've become
audit_inode(name, path->dentry, 0);
It hadn't; my fault.  And after fixing that everything becomes nice and
unconditional - the last argument of audit_inode() is always an AUDIT_...
constant or zero.  Moving AUDIT_... definitions outside of ifdef on
CONFIG_AUDITSYSCALL, getting rid of remapping in audit_inode() and
passing the right values in 3 callers that don't pass 0 and LOOKUP_NO_EVAL
can go to hell.

Any objections from audit folks?

2) comment in namei.h is seriously out of sync with reality.  To quote:
 *  - follow links at the end
OK, that's LOOKUP_FOLLOW (1)
 *  - require a directory
... and LOOKUP_DIRECTORY (2)
 *  - ending slashes ok even for nonexistent files
... used to be about LOOKUP_CONTINUE (eight years dead now)
 *  - internal "there are more path components" flag
... LOOKUP_PARENT (16)
 *  - dentry cache is untrusted; force a real lookup
... LOOKUP_REVAL (32)
 *  - suppress terminal 

Re: [PATCH v9 05/10] namei: O_BENEATH-style path resolution flags

2019-07-12 Thread Al Viro
On Fri, Jul 12, 2019 at 04:00:26PM +0100, Al Viro wrote:
> On Fri, Jul 12, 2019 at 02:25:53PM +0100, Al Viro wrote:
> 
> > if (flags & LOOKUP_BENEATH) {
> > nd->root = nd->path;
> > if (!(flags & LOOKUP_RCU))
> > path_get(>root);
> > else
> > nd->root_seq = nd->seq;
> 
> BTW, this assignment is needed for LOOKUP_RCU case.  Without it
> you are pretty much guaranteed that lazy pathwalk will fail,
> when it comes to complete_walk().
> 
> Speaking of which, what would happen if LOOKUP_ROOT/LOOKUP_BENEATH
> combination would someday get passed?

I don't understand what's going on with ->r_seq in there - your
call of path_is_under() is after having (re-)sampled rename_lock,
but if that was the only .. in there, who's going to recheck
the value?  For that matter, what's to guarantee that the thing
won't get moved just as you are returning from handle_dots()?

IOW, what does LOOKUP_IN_ROOT guarantee for caller (openat2())?


Re: [PATCH v9 05/10] namei: O_BENEATH-style path resolution flags

2019-07-12 Thread Al Viro
On Fri, Jul 12, 2019 at 02:25:53PM +0100, Al Viro wrote:

>   if (flags & LOOKUP_BENEATH) {
>   nd->root = nd->path;
>   if (!(flags & LOOKUP_RCU))
>   path_get(>root);
>   else
>   nd->root_seq = nd->seq;

BTW, this assignment is needed for LOOKUP_RCU case.  Without it
you are pretty much guaranteed that lazy pathwalk will fail,
when it comes to complete_walk().

Speaking of which, what would happen if LOOKUP_ROOT/LOOKUP_BENEATH
combination would someday get passed?


Re: [PATCH v9 05/10] namei: O_BENEATH-style path resolution flags

2019-07-12 Thread Al Viro
On Fri, Jul 12, 2019 at 01:55:52PM +0100, Al Viro wrote:
> On Fri, Jul 12, 2019 at 01:39:24PM +0100, Al Viro wrote:
> > On Fri, Jul 12, 2019 at 08:57:45PM +1000, Aleksa Sarai wrote:
> > 
> > > > > @@ -2350,9 +2400,11 @@ static const char *path_init(struct nameidata 
> > > > > *nd, unsigned flags)
> > > > >   s = ERR_PTR(error);
> > > > >   return s;
> > > > >   }
> > > > > - error = dirfd_path_init(nd);
> > > > > - if (unlikely(error))
> > > > > - return ERR_PTR(error);
> > > > > + if (likely(!nd->path.mnt)) {
> > > > 
> > > > Is that a weird way of saying "if we hadn't already called 
> > > > dirfd_path_init()"?
> > > 
> > > Yes. I did it to be more consistent with the other "have we got the
> > > root" checks elsewhere. Is there another way you'd prefer I do it?
> > 
> > "Have we got the root" checks are inevitable evil; here you are making the
> > control flow in a single function hard to follow.
> > 
> > I *think* what you are doing is
> > absolute pathname, no LOOKUP_BENEATH:
> > set_root
> > error = nd_jump_root(nd)
> > else
> > error = dirfd_path_init(nd)
> > return unlikely(error) ? ERR_PTR(error) : s;
> > which should be a lot easier to follow (not to mention shorter), but I might
> > be missing something in all of that.
> 
> PS: if that's what's going on, I would be tempted to turn the entire
> path_init() part into this:
>   if (flags & LOOKUP_BENEATH)
>   while (*s == '/')
>   s++;
> in the very beginning (plus the handling of nd_jump_root() prototype
> change, but that belongs with nd_jump_root() change itself, obviously).
> Again, I might be missing something here...

Argh... I am, at that - you have setting path->root (and grabbing it)
in LOOKUP_BENEATH cases and you do it after dirfd_path_init().  So
how about
if (flags & LOOKUP_BENEATH)
while (*s == '/')
s++;
before the whole thing and
if (*s == '/') { /* can happen only without LOOKUP_BENEATH */
set_root(nd);
error = nd_jump_root(nd);
if (unlikely(error))
return ERR_PTR(error);
} else if (nd->dfd == AT_FDCWD) {
if (flags & LOOKUP_RCU) {
struct fs_struct *fs = current->fs;
unsigned seq;

do {
seq = read_seqcount_begin(>seq);
nd->path = fs->pwd;
nd->inode = nd->path.dentry->d_inode;
nd->seq = 
__read_seqcount_begin(>path.dentry->d_seq);
} while (read_seqcount_retry(>seq, seq));
} else {
get_fs_pwd(current->fs, >path);
nd->inode = nd->path.dentry->d_inode;
}  
} else {
/* Caller must check execute permissions on the starting path 
component */
struct fd f = fdget_raw(nd->dfd);
struct dentry *dentry;

if (!f.file)
return ERR_PTR(-EBADF);

dentry = f.file->f_path.dentry;

if (*s && unlikely(!d_can_lookup(dentry))) {
fdput(f);
return ERR_PTR(-ENOTDIR);
}

nd->path = f.file->f_path;
if (flags & LOOKUP_RCU) {
nd->inode = nd->path.dentry->d_inode;
nd->seq = read_seqcount_begin(>path.dentry->d_seq);
} else {
path_get(>path);
nd->inode = nd->path.dentry->d_inode;
}
fdput(f);
}
if (flags & LOOKUP_BENEATH) {
nd->root = nd->path;
if (!(flags & LOOKUP_RCU))
path_get(>root);
else
nd->root_seq = nd->seq;
}
return s;
replacing the part in the end?  Makes for much smaller change; it might
very well still make sense to add dirfd_path_init() as a separate
cleanup (perhaps with the *s == '/' case included), though.


Re: [PATCH v9 05/10] namei: O_BENEATH-style path resolution flags

2019-07-12 Thread Al Viro
On Fri, Jul 12, 2019 at 01:39:24PM +0100, Al Viro wrote:
> On Fri, Jul 12, 2019 at 08:57:45PM +1000, Aleksa Sarai wrote:
> 
> > > > @@ -2350,9 +2400,11 @@ static const char *path_init(struct nameidata 
> > > > *nd, unsigned flags)
> > > > s = ERR_PTR(error);
> > > > return s;
> > > > }
> > > > -   error = dirfd_path_init(nd);
> > > > -   if (unlikely(error))
> > > > -   return ERR_PTR(error);
> > > > +   if (likely(!nd->path.mnt)) {
> > > 
> > > Is that a weird way of saying "if we hadn't already called 
> > > dirfd_path_init()"?
> > 
> > Yes. I did it to be more consistent with the other "have we got the
> > root" checks elsewhere. Is there another way you'd prefer I do it?
> 
> "Have we got the root" checks are inevitable evil; here you are making the
> control flow in a single function hard to follow.
> 
> I *think* what you are doing is
>   absolute pathname, no LOOKUP_BENEATH:
>   set_root
>   error = nd_jump_root(nd)
>   else
>   error = dirfd_path_init(nd)
>   return unlikely(error) ? ERR_PTR(error) : s;
> which should be a lot easier to follow (not to mention shorter), but I might
> be missing something in all of that.

PS: if that's what's going on, I would be tempted to turn the entire
path_init() part into this:
if (flags & LOOKUP_BENEATH)
while (*s == '/')
s++;
in the very beginning (plus the handling of nd_jump_root() prototype
change, but that belongs with nd_jump_root() change itself, obviously).
Again, I might be missing something here...


Re: [PATCH v9 05/10] namei: O_BENEATH-style path resolution flags

2019-07-12 Thread Al Viro
On Fri, Jul 12, 2019 at 08:57:45PM +1000, Aleksa Sarai wrote:

> > > @@ -2350,9 +2400,11 @@ static const char *path_init(struct nameidata *nd, 
> > > unsigned flags)
> > >   s = ERR_PTR(error);
> > >   return s;
> > >   }
> > > - error = dirfd_path_init(nd);
> > > - if (unlikely(error))
> > > - return ERR_PTR(error);
> > > + if (likely(!nd->path.mnt)) {
> > 
> > Is that a weird way of saying "if we hadn't already called 
> > dirfd_path_init()"?
> 
> Yes. I did it to be more consistent with the other "have we got the
> root" checks elsewhere. Is there another way you'd prefer I do it?

"Have we got the root" checks are inevitable evil; here you are making the
control flow in a single function hard to follow.

I *think* what you are doing is
absolute pathname, no LOOKUP_BENEATH:
set_root
error = nd_jump_root(nd)
else
error = dirfd_path_init(nd)
return unlikely(error) ? ERR_PTR(error) : s;
which should be a lot easier to follow (not to mention shorter), but I might
be missing something in all of that.


Re: [PATCH v9 05/10] namei: O_BENEATH-style path resolution flags

2019-07-12 Thread Aleksa Sarai
On 2019-07-12, Al Viro  wrote:
> On Sun, Jul 07, 2019 at 12:57:32AM +1000, Aleksa Sarai wrote:
> > @@ -1442,8 +1464,11 @@ static int follow_dotdot_rcu(struct nameidata *nd)
> > struct inode *inode = nd->inode;
> >  
> > while (1) {
> > -   if (path_equal(>path, >root))
> > +   if (path_equal(>path, >root)) {
> > +   if (unlikely(nd->flags & LOOKUP_BENEATH))
> > +   return -EXDEV;
> 
> > @@ -1468,6 +1493,8 @@ static int follow_dotdot_rcu(struct nameidata *nd)
> > return -ECHILD;
> > if (>mnt == nd->path.mnt)
> > break;
> > +   if (unlikely(nd->flags & LOOKUP_XDEV))
> > +   return -EXDEV;
> > /* we know that mountpoint was pinned */
> > nd->path.dentry = mountpoint;
> > nd->path.mnt = >mnt;
> > @@ -1482,6 +1509,8 @@ static int follow_dotdot_rcu(struct nameidata *nd)
> > return -ECHILD;
> > if (!mounted)
> > break;
> > +   if (unlikely(nd->flags & LOOKUP_XDEV))
> > +   return -EXDEV;
> 
> Are you sure these failure exits in follow_dotdot_rcu() won't give
> suprious hard errors?

I could switch to -ECHILD for the *_rcu() checks if you'd prefer that.
Though, I'd have (probably naively) thought that you'd have already
gotten -ECHILD from the seqlock checks if there was a race during ".."
handling.

> > +   if (unlikely(nd->flags & LOOKUP_BENEATH)) {
> > +   error = dirfd_path_init(nd);
> > +   if (unlikely(error))
> > +   return ERR_PTR(error);
> > +   nd->root = nd->path;
> > +   if (!(nd->flags & LOOKUP_RCU))
> > +   path_get(>root);
> > +   }
> > if (*s == '/') {
> > if (likely(!nd->root.mnt))
> > set_root(nd);
> > @@ -2350,9 +2400,11 @@ static const char *path_init(struct nameidata *nd, 
> > unsigned flags)
> > s = ERR_PTR(error);
> > return s;
> > }
> > -   error = dirfd_path_init(nd);
> > -   if (unlikely(error))
> > -   return ERR_PTR(error);
> > +   if (likely(!nd->path.mnt)) {
> 
> Is that a weird way of saying "if we hadn't already called dirfd_path_init()"?

Yes. I did it to be more consistent with the other "have we got the
root" checks elsewhere. Is there another way you'd prefer I do it?

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH



signature.asc
Description: PGP signature


Re: [PATCH v9 05/10] namei: O_BENEATH-style path resolution flags

2019-07-11 Thread Al Viro
On Sun, Jul 07, 2019 at 12:57:32AM +1000, Aleksa Sarai wrote:

> @@ -1442,8 +1464,11 @@ static int follow_dotdot_rcu(struct nameidata *nd)
>   struct inode *inode = nd->inode;
>  
>   while (1) {
> - if (path_equal(>path, >root))
> + if (path_equal(>path, >root)) {
> + if (unlikely(nd->flags & LOOKUP_BENEATH))
> + return -EXDEV;

> @@ -1468,6 +1493,8 @@ static int follow_dotdot_rcu(struct nameidata *nd)
>   return -ECHILD;
>   if (>mnt == nd->path.mnt)
>   break;
> + if (unlikely(nd->flags & LOOKUP_XDEV))
> + return -EXDEV;
>   /* we know that mountpoint was pinned */
>   nd->path.dentry = mountpoint;
>   nd->path.mnt = >mnt;
> @@ -1482,6 +1509,8 @@ static int follow_dotdot_rcu(struct nameidata *nd)
>   return -ECHILD;
>   if (!mounted)
>   break;
> + if (unlikely(nd->flags & LOOKUP_XDEV))
> + return -EXDEV;

Are you sure these failure exits in follow_dotdot_rcu() won't give
suprious hard errors?

> + if (unlikely(nd->flags & LOOKUP_BENEATH)) {
> + error = dirfd_path_init(nd);
> + if (unlikely(error))
> + return ERR_PTR(error);
> + nd->root = nd->path;
> + if (!(nd->flags & LOOKUP_RCU))
> + path_get(>root);
> + }
>   if (*s == '/') {
>   if (likely(!nd->root.mnt))
>   set_root(nd);
> @@ -2350,9 +2400,11 @@ static const char *path_init(struct nameidata *nd, 
> unsigned flags)
>   s = ERR_PTR(error);
>   return s;
>   }
> - error = dirfd_path_init(nd);
> - if (unlikely(error))
> - return ERR_PTR(error);
> + if (likely(!nd->path.mnt)) {

Is that a weird way of saying "if we hadn't already called dirfd_path_init()"?