Re: [nameidata 1/2] Don't pass NULL nameidata to vfs_create

2007-05-11 Thread Andreas Gruenbacher
On Monday 16 April 2007 18:21, Christoph Hellwig wrote:
 On Mon, Apr 16, 2007 at 06:11:30PM +0200, Andreas Gruenbacher wrote:
  On Thursday 12 April 2007 12:06, Christoph Hellwig wrote:
   Once again very strong NACK.  Every conditional passing of vfsmounts get 
   my veto.  As mentioned last time if you really want this send a patch
   series first that passed the vfsmount consistantly.
  
  I don't consider it fair to NACK this patch just because some other part
  of the kernel uses vfs_create() in the wrong way -- those are really
  independent issues. The bug is not that hard to fix though; here is a
  proposed patch on top of the other AppArmor patches.
 
 Note that it's not just this particular hunk, all these kinds of conditional
 passing are wrong.

Yes, the vfs uses or passes through a NULL nameidata or vfsmount pointer and 
no intent information in several places:

 (1) vfs_create() is called with a NULL nameidata in two places,

 (2) lookup_one_len() and lookup_one_len_kern() pass a NULL nameidata to
 permission(), d_op-d_revalidate(), and i_op-lookup(),

 (3) file_permission() passes a NULL nameidata to permission().

 (4) permission() is directly called with NULL nameidata in some places, and

In general it is incorrect to use NULL nameidata or vfsmounts because then the
vfsmount-mnt_flags cannot be checked, and no intent information is available. 
Intents are an optional optimization for remote filesystems; we can simply 
ignore them for local filesystems. There are some cases where we are passing 
NULL which are real bugs, but there are also other cases in which the 
operations are not mount related: for example, filesystems like pipefs have 
the MS_NOUSER flag set which indicates that they cannot be mounted at all. On 
filesystems like sysfs, files are created and removed whether or not that 
filesystem is mounted.

The places where we should pass a proper nameidata / vfsmount but don't should 
be fixed. Those are long-standing issues in the vfs though, and at least some 
are not easy to correct. AppArmor is affected by some of the NULL vfsmount 
passing, and we found a few more problems when looking into this more 
closely, but most of that NULL passing doesn't affect AppArmor:

 * In some places, vfs functions are called without nameidata / vfsmount,
   followed by calling lsm hooks with a vfsmount. In those cases, the vfs
   functions cannot check the vfsmount flags, but the lsm hooks will still
   do the appropriate checks.

 * In the AppArmor security model, directory searching is always allowed, and
   the access check are done once the leaf dentry + vfsmount has been looked
   up. For example, granting write access to /var/tmp/foo in a profile is
   sufficient to allow that access; search access to /var and /var/tmp does
   not need to be specified. (The regular inode permission checks are of
   course still performed.)

 * For filesystems like nfsd, the concept of pathname based access control
   doesn't make sense because of properties of the protocol: there is no
   reliable mapping from an nfs file handle to a particular file; aliases
   to the same file cannot be distinguished. (The AppArmor techdoc [1]
   describes this in more detail.)

   Not allowing to confine nfsd by AppArmor doesn't hurt much: nfsd provides
   its own export access control mechanisms. Also, nfsd cannot really be
   confined in a secure way because of how it is implemented in the kernel.
   Gfs2 may have similar properties; I didn't actually look into the protocol.

So I propose to separate the vfs NULL nameidata / vfsmount passing discussion 
from the AppArmor discussion. I have no problem working on the vfs fixes -- 
we could some nice cleanups there -- but that really is independent of 
AppArmor.

We almost have the revised AppArmor patches ready. We'll include all the fixes 
that are truly necessary, and everything beyond that will be posted 
separately.

 But anyway, creating fake nameidata structures is not really helpful.
 If there is a nameidata passed people expect it to be complete, and
 if you pass them to an LSM people will e.g. try to look into lookup
 intents.

Splitting up nameidata helps somewhat here. In cases where we don't have 
intent information available we can zero out the intent flags, and so in the 
worst case, we won't get the intent optimizations.

Thanks,
Andreas
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [nameidata 1/2] Don't pass NULL nameidata to vfs_create

2007-04-17 Thread Andreas Gruenbacher
On Monday 16 April 2007 18:45, Christoph Hellwig wrote:
 You should provide intent information, yes - which your patch didn't :)

Well, the information implicitly provided is no intent: In do_create() in
ipc/mqueue.c intents would be pretty pointless because the mqueue filesystem
is local. In fs/nfsd/vfs.c, intents would make slightly more sense assuming
that the underlying filesystem eported via nfsd is remote. That's an
optimization, and not even a very important one.

 (Which btw, I expect to cause quite a few problems for apparmor or other
 lsms, but I guess so far no one has tried them on NFSv4)

Pathname wise, NFSv4 should look like any other filesystem on the client side. 
On the Server side, the concept of pathnames doesn't really fly for nfs: if a 
directory contains more than one link to the same file, there is no way to 
tell those aliases from each other from the file descriptor. In addition, 
computing even the somewhat ambiguous pathnames that can be computed would
require subtree checking. But trying to confine nfsd is pretty pointless 
anyway: the deamon is privileged and can do whatever it wants. It makes more
sense to export the right directories with the right permissions in the first
place.

Thanks,
Andreas
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[nameidata 1/2] Don't pass NULL nameidata to vfs_create

2007-04-16 Thread Andreas Gruenbacher
On Thursday 12 April 2007 12:06, Christoph Hellwig wrote:
 Once again very strong NACK.  Every conditional passing of vfsmounts get my
 veto.  As mentioned last time if you really want this send a patch series
 first that passed the vfsmount consistantly.

I don't consider it fair to NACK this patch just because some other part of 
the kernel uses vfs_create() in the wrong way -- those are really independent 
issues. The bug is not that hard to fix though; here is a proposed patch on 
top of the other AppArmor patches.

The offenders are nfsd and the mqueue filesystem. Instantiate a struct 
nameidata there.

The .dentry and .mnt fields can easily be filled in in nfsd. The mqueue 
filesystem is somewhat special: files that mq_open creates are on an internal 
mount and the mqueue filesystem is not generally visible (it may or may not 
be mounted). When passing a regular vfsmount to vfs_create the files would 
appear disconnected while they actually are kernel-internal objects. Use a 
NULL vfsmount there to avoid that -- passing the kernel-internal vfsmount 
there wouldn't help, anyway.

Signed-off-by: Andreas Gruenbacher [EMAIL PROTECTED]

---
 fs/namei.c|7 ---
 fs/nfsd/vfs.c |   23 +++
 ipc/mqueue.c  |6 +-
 3 files changed, 28 insertions(+), 8 deletions(-)

--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1503,7 +1503,7 @@ int vfs_create(struct inode *dir, struct
return -EACCES; /* shouldn't it be ENOSYS? */
mode = S_IALLUGO;
mode |= S_IFREG;
-   error = security_inode_create(dir, dentry, nd ? nd-mnt : NULL, mode);
+   error = security_inode_create(dir, dentry, nd-mnt, mode);
if (error)
return error;
DQUOT_INIT(dir);
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1108,6 +1108,18 @@ nfsd_commit(struct svc_rqst *rqstp, stru
 }
 #endif /* CONFIG_NFSD_V3 */
 
+static inline int
+nfsd_do_create(struct inode *dir, struct dentry *child, struct vfsmount *mnt,
+  int mode)
+{
+   struct nameidata nd = {
+   .dentry = child,
+   .mnt = mnt,
+   };
+
+   return vfs_create(dir, child, mode, nd);
+}
+
 /*
  * Create a file (regular, directory, device, fifo); UNIX sockets 
  * not yet implemented.
@@ -1192,7 +1204,8 @@ nfsd_create(struct svc_rqst *rqstp, stru
err = 0;
switch (type) {
case S_IFREG:
-   host_err = vfs_create(dirp, dchild, iap-ia_mode, NULL);
+   host_err = nfsd_do_create(dirp, dchild, exp-ex_mnt,
+ iap-ia_mode);
break;
case S_IFDIR:
host_err = vfs_mkdir(dirp, dchild, exp-ex_mnt, iap-ia_mode);
@@ -1253,6 +1266,7 @@ nfsd_create_v3(struct svc_rqst *rqstp, s
int *truncp, int *created)
 {
struct dentry   *dentry, *dchild = NULL;
+   struct svc_export *exp;
struct inode*dirp;
__be32  err;
int host_err;
@@ -1271,6 +1285,7 @@ nfsd_create_v3(struct svc_rqst *rqstp, s
goto out;
 
dentry = fhp-fh_dentry;
+   exp = fhp-fh_export;
dirp = dentry-d_inode;
 
/* Get all the sanity checks out of the way before
@@ -1288,7 +1303,7 @@ nfsd_create_v3(struct svc_rqst *rqstp, s
if (IS_ERR(dchild))
goto out_nfserr;
 
-   err = fh_compose(resfhp, fhp-fh_export, dchild, fhp);
+   err = fh_compose(resfhp, exp, dchild, fhp);
if (err)
goto out;
 
@@ -1334,13 +1349,13 @@ nfsd_create_v3(struct svc_rqst *rqstp, s
goto out;
}
 
-   host_err = vfs_create(dirp, dchild, iap-ia_mode, NULL);
+   host_err = nfsd_do_create(dirp, dchild, exp-ex_mnt, iap-ia_mode);
if (host_err  0)
goto out_nfserr;
if (created)
*created = 1;
 
-   if (EX_ISSYNC(fhp-fh_export)) {
+   if (EX_ISSYNC(exp)) {
err = nfserrno(nfsd_sync_dir(dentry));
/* setattr will sync the child (or not) */
}
--- a/ipc/mqueue.c
+++ b/ipc/mqueue.c
@@ -604,6 +604,10 @@ static int mq_attr_ok(struct mq_attr *at
 static struct file *do_create(struct dentry *dir, struct dentry *dentry,
int oflag, mode_t mode, struct mq_attr __user *u_attr)
 {
+   struct nameidata nd = {
+   .dentry = dentry,
+   /* Not a regular create, so set .mnt to NULL. */
+   };
struct mq_attr attr;
int ret;
 
@@ -619,7 +623,7 @@ static struct file *do_create(struct den
}
 
mode = ~current-fs-umask;
-   ret = vfs_create(dir-d_inode, dentry, mode, NULL);
+   ret = vfs_create(dir-d_inode, dentry, mode, nd);
dentry-d_fsdata = NULL;
if (ret)
goto out;
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [nameidata 1/2] Don't pass NULL nameidata to vfs_create

2007-04-16 Thread Matthew Wilcox
On Mon, Apr 16, 2007 at 06:11:30PM +0200, Andreas Gruenbacher wrote:
 +static inline int
 +nfsd_do_create(struct inode *dir, struct dentry *child, struct vfsmount *mnt,
 +int mode)
 +{
 + struct nameidata nd = {
 + .dentry = child,
 + .mnt = mnt,
 + };
 +
 + return vfs_create(dir, child, mode, nd);
 +}
 +

Wouldn't it normally result in fewer instructions (on most architectures
... maybe not x86) to keep the same argument order as vfs_create?  ie:

static inline int nfsd_do_create(struct inode *dir, struct dentry *child,
 int mode, struct vfsmount *mnt)

-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [nameidata 1/2] Don't pass NULL nameidata to vfs_create

2007-04-16 Thread Andreas Gruenbacher
On Monday 16 April 2007 18:21, Christoph Hellwig wrote:
 But anyway, creating fake nameidata structures is not really helpful.
 If there is a nameidata passed people expect it to be complete, and
 if you pass them to an LSM people will e.g. try to look into lookup
 intents.

I don't actually agree with that: when nfsd creates a file, it still is a file 
create no matter where it originates from, and so it does make sense to 
provide the appropriate intent information too. Struct nameidata contains 
other crap only needed during an actual lookup too --- that's a mess, and the 
nameidata2 patch shows one possibility how to deal with it. (It's the 
cleanest approach I could think of right now without cluttering the vfs code 
with insane amounts of dereferences.)

Thanks,
Andreas
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [nameidata 1/2] Don't pass NULL nameidata to vfs_create

2007-04-16 Thread Christoph Hellwig
On Mon, Apr 16, 2007 at 06:40:41PM +0200, Andreas Gruenbacher wrote:
 On Monday 16 April 2007 18:21, Christoph Hellwig wrote:
  But anyway, creating fake nameidata structures is not really helpful.
  If there is a nameidata passed people expect it to be complete, and
  if you pass them to an LSM people will e.g. try to look into lookup
  intents.
 
 I don't actually agree with that: when nfsd creates a file, it still is a 
 file 
 create no matter where it originates from, and so it does make sense to 
 provide the appropriate intent information too. Struct nameidata contains 
 other crap only needed during an actual lookup too --- that's a mess, and the 

You should provide intent information, yes - which your patch didn't :)

And yes, I didn't like doing the ugly intent in nameidata hack, it's
creating loads of problems for various parties, e.g. the stackable
filesystem folks.  Now the basic intent in nameidata mistaken has been
made even worse by passing back a struct file in it conditionally and
doing lots of work in -lookup that shouldn't be there.  (Which btw,
I expect to cause quite a few problems for apparmor or other lsms,
but I guess so far no one has tried them on NFSv4)

-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html