Re: fallocate() man page - darft 2
On Fri, Aug 03, 2007 at 01:59:53PM +0200, Michael Kerrisk wrote: Amit There is a typo above. We have file system repeated twice in above sentence. Second one should be file. /Amit Thanks for catching that. Okay -- it seems that this page is pretty much ready for publication, right? I'll hold off for a bit, until nearer the end of the 2.6.23 cycle. I agree. Thanks! -- Regards, Amit Arora - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Distributed storage.
On Sun, Aug 05, 2007 at 02:23:45PM -0700, Daniel Phillips ([EMAIL PROTECTED]) wrote: On Sunday 05 August 2007 08:08, Evgeniy Polyakov wrote: If we are sleeping in memory pool, then we already do not have memory to complete previous requests, so we are in trouble. Not at all. Any requests in flight are guaranteed to get the resources they need to complete. This is guaranteed by the combination of memory reserve management and request queue throttling. In logical terms, reserve management plus queue throttling is necessary and sufficient to prevent these deadlocks. Conversely, the absence of either one allows deadlock. Only if you have two, which must be closely related to each other (i.e. each request must have network reserve big enough to store data). This can work for devices which do not require additional allocations (like usual local storage), but not for network connected ones. It works for network devices too, and also for a fancy device like ddsnap, which is the moral equivalent of a filesystem implemented in user space. With or without vm deadlock patches? I can not see how it can work, if network does not have a reserve and there is not free memory completely. If all systems have reserve then yes, it works good. By default things will be like they are now, except additional non-atomic increment and branch in generic_make_request() and decrement and wake in bio_end_io()? -endio is called in interrupt context, so the accounting needs to be atomic as far as I can see. Actually we only care about if there is a place in the queue or not - so it can be a flag. Actually non-atomic operations are ok, since having plus/minus couple of requests in flight does not change the picture, but allows not to introduce slow atomic operations in the fast path. We actually account the total number of bio pages in flight, otherwise you would need to assume the largest possible bio and waste a huge amount of reserve memory. A counting semaphore works fine for this purpose, with some slight inefficiency that is nigh on unmeasurable in the block IO path. What the semaphore does is make the patch small and easy to understand, which is important at this point. Yes, it can be bio vectors. I can cook up such a patch if idea worth efforts. It is. There are some messy details... You need a place to store the accounting variable/semaphore and need to be able to find that place again in -endio. Trickier than it sounds, because of the unstructured way drivers rewrite -bi_bdev. Peterz has already poked at this in a number of different ways, typically involving backing_dev_info, which seems like a good idea to me. We can demand that reserve is not per virtual device, but per real one - for example in case of distributed storage locally connected node should have much higher limit than network one, but having a per-virtual device reserve might end up with situation, when local node can proceed data, but no requests will be queued sine all requests below limit are in network node. In case of per real device limit there is no need to increase bio. -- Evgeniy Polyakov - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Distributed storage.
On Sun, Aug 05, 2007 at 02:35:04PM -0700, Daniel Phillips ([EMAIL PROTECTED]) wrote: On Sunday 05 August 2007 08:01, Evgeniy Polyakov wrote: On Sun, Aug 05, 2007 at 01:06:58AM -0700, Daniel Phillips wrote: DST original code worked as device mapper plugin too, but its two additional allocations (io and clone) per block request ended up for me as a show stopper. Ah, sorry, I misread. A show stopper in terms of efficiency, or in terms of deadlock? At least as in terms of efficiency. Device mapper lives in happy world where memory does not end and allocations are fast. Are you saying that things are different for a network block device because it needs to do GFP_ATOMIC allocations? If so then that is just a misunderstanding. The global page reserve Peter and I use is available in interrupt context just like GFP_ATOMIC. No, neither device needs atomic allocations, I just said that device mapper is too expensive, since it performs alot of additional allocations in the fast path and is not designed to cases when allocation fails, since there is no recovery path and (maybe because of this) mempool allocation waits forever until there is free memory and can not fail. -- Evgeniy Polyakov - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC 20/26] union-mount: Simple union-mount readdir implementation
On Mon, Jul 30, 2007 at 06:13:43PM +0200, Jan Blunck wrote: --- a/include/linux/union.h +++ b/include/linux/union.h @@ -53,6 +53,7 @@ extern void __shrink_d_unions(struct den extern int attach_mnt_union(struct vfsmount *, struct vfsmount *, struct dentry *); extern void detach_mnt_union(struct vfsmount *); +extern int readdir_union(struct file *, void *, filldir_t); #else /* CONFIG_UNION_MOUNT */ @@ -69,5 +70,29 @@ extern void detach_mnt_union(struct vfsm #define detach_mnt_union(x) do { } while (0) #endif /* CONFIG_UNION_MOUNT */ + +static inline int do_readdir(struct file *file, void *buf, filldir_t filler) +{ + struct inode *inode = file-f_path.dentry-d_inode; + int res; + +#ifdef CONFIG_UNION_MOUNT + if (IS_MNT_UNION(file-f_path.mnt)) + res = readdir_union(file, buf, filler); + else +#endif + { + mutex_lock(inode-i_mutex); + res = -ENOENT; + if (!IS_DEADDIR(inode)) { + res = file-f_op-readdir(file, buf, filler); + file_accessed(file); + } + mutex_unlock(inode-i_mutex); + } + + return res; +} Here you are doing readdir_union for all the directories under a union mount point, which is an overhead (building the readdir cache). Here is the fix: From: Bharata B Rao [EMAIL PROTECTED] Within a union mount point, there can be directories which don't have a union stack underneath them. And readdir() doesn't have to maintain a cache of dirents for such directories. But the current patch maintains the cache for such directories. Fix this. Signed-off-by: Bharata B Rao [EMAIL PROTECTED] --- fs/union.c| 17 + include/linux/union.h | 11 ++- 2 files changed, 23 insertions(+), 5 deletions(-) --- a/fs/union.c +++ b/fs/union.c @@ -366,6 +366,23 @@ int follow_union_mount(struct vfsmount * } /* + * is_dir_unioned - check if the directory represented by @mnt and @dentry + * has a union stack underneath. + * + * Returns true if a union stack exists at this directory level, else returns + * false. + */ +int is_dir_unioned(struct vfsmount *mnt, struct dentry *dentry) +{ + struct union_mount *um; + + spin_lock(union_lock); + um = union_lookup(dentry, mnt); + spin_unlock(union_lock); + return um ? 1: 0; +} + +/* * This must be called when unhashing a dentry. This is called with dcache_lock * and unhashes all unions this dentry is in. */ --- a/include/linux/union.h +++ b/include/linux/union.h @@ -54,6 +54,7 @@ extern int attach_mnt_union(struct vfsmo struct dentry *); extern void detach_mnt_union(struct vfsmount *); extern int readdir_union(struct file *, void *, filldir_t); +extern int is_dir_unioned(struct vfsmount *, struct dentry *); extern int union_relookup_topmost(struct nameidata *, int); extern struct dentry *union_create_topmost(struct nameidata *, struct qstr *, struct path *); @@ -82,13 +83,14 @@ extern int union_copyup(struct nameidata static inline int do_readdir(struct file *file, void *buf, filldir_t filler) { - struct inode *inode = file-f_path.dentry-d_inode; int res; + struct inode *inode = file-f_path.dentry-d_inode; #ifdef CONFIG_UNION_MOUNT - if (IS_MNT_UNION(file-f_path.mnt)) - res = readdir_union(file, buf, filler); - else + if (IS_MNT_UNION(file-f_path.mnt)) { + if (is_dir_unioned(file-f_path.mnt, file-f_path.dentry)) + res = readdir_union(file, buf, filler); + } else #endif { mutex_lock(inode-i_mutex); @@ -99,7 +101,6 @@ static inline int do_readdir(struct file } mutex_unlock(inode-i_mutex); } - return res; } Regards, Bharata. - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
swap out memory
Hi, I'm trying to swap out kvm guest pages. The idea is to free some pages when memory pressure is high. kvm has special things to handle like shadow page tables. Before guest page is released, we need free some data, that is guest page has 'private' data, so we can't directly make the guest page swapout with Linux swapout mechanism. I'd like write guest pages to a file or swap. kvm guest pages are in its address space and added into lru list like normal page (the address space is very like a shmem file's, kvm has a memory based file system). When vmscan decided to free one guest page, kvm guest pages's aops.writepage will free the private data and then write it out to block device. The problem is how to write guest pages. I thought we have some choices: 1. swap it to swapfile. Like shmem does, using move_to_swap_cache to move guest page from its address space to swap address space, and finally it's written to swapfile. This method works well, but as kvm is a module, I must export some swap relelated APIs, which Christoph Hellwig dislike. 2. write it to a file. Just like the stack fs does, in kvm address space's .writepage, let low fs's aops write the page. This involves allocating a new page for low fs file and copy kvm page to the new page. As this (doing swap) is done when memory is tight, allocating new page isn't good. The copy isn't good too. 3.write it to a file. Using bmap to get file's block info and using low level sumbit_bio for read/write. This is like what swapfile does, but we do it by ourselves, so don't need use swap symbols. Do you have any suggestion which method is good, or if you have better choice I could try? Thanks, Shaohua - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 03/25] AFFS: call attr_kill_to_mode from affs_notify_change
Signed-off-by: Jeff Layton [EMAIL PROTECTED] --- fs/affs/inode.c |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/fs/affs/inode.c b/fs/affs/inode.c index 4609a6c..1fa2937 100644 --- a/fs/affs/inode.c +++ b/fs/affs/inode.c @@ -218,6 +218,7 @@ affs_notify_change(struct dentry *dentry, struct iattr *attr) pr_debug(AFFS: notify_change(%lu,0x%x)\n,inode-i_ino,attr-ia_valid); + attr_kill_to_mode(inode, attr); error = inode_change_ok(inode,attr); if (error) goto out; -- 1.5.2.2 - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 04/25] AFS: call attr_kill_to_mode from afs_setattr
Not sure if this is necessary or desirable for AFS, but adding it for consistency. Signed-off-by: Jeff Layton [EMAIL PROTECTED] --- fs/afs/inode.c |3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/fs/afs/inode.c b/fs/afs/inode.c index d196840..d0ad5e6 100644 --- a/fs/afs/inode.c +++ b/fs/afs/inode.c @@ -370,6 +370,9 @@ int afs_setattr(struct dentry *dentry, struct iattr *attr) vnode-fid.vid, vnode-fid.vnode, dentry-d_name.name, attr-ia_valid); + /* FIXME: is this necessary? */ + attr_kill_to_mode(dentry-d_inode, attr); + if (!(attr-ia_valid (ATTR_SIZE | ATTR_MODE | ATTR_UID | ATTR_GID | ATTR_MTIME))) { _leave( = 0 [unsupported]); -- 1.5.2.2 - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 05/25] CIFS: call attr_kill_to_mode in cifs_setattr
Signed-off-by: Jeff Layton [EMAIL PROTECTED] --- fs/cifs/inode.c |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c index dd41677..6fee1fa 100644 --- a/fs/cifs/inode.c +++ b/fs/cifs/inode.c @@ -1429,6 +1429,8 @@ int cifs_setattr(struct dentry *direntry, struct iattr *attrs) cifs_sb = CIFS_SB(direntry-d_inode-i_sb); pTcon = cifs_sb-tcon; + attr_kill_to_mode(direntry-d_inode, attrs); + if ((cifs_sb-mnt_cifs_flags CIFS_MOUNT_NO_PERM) == 0) { /* check if we have permission to change attrs */ rc = inode_change_ok(direntry-d_inode, attrs); -- 1.5.2.2 - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 02/25] 9p: have v9fs_vfs_setattr call attr_kill_to_mode
Signed-off-by: Jeff Layton [EMAIL PROTECTED] --- fs/9p/vfs_inode.c |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c index e5c45ee..00fcd4e 100644 --- a/fs/9p/vfs_inode.c +++ b/fs/9p/vfs_inode.c @@ -756,6 +756,7 @@ static int v9fs_vfs_setattr(struct dentry *dentry, struct iattr *iattr) return PTR_ERR(fid); v9fs_blank_wstat(wstat); + attr_kill_to_mode(dentry-d_inode, iattr); if (iattr-ia_valid ATTR_MODE) wstat.mode = unixmode2p9mode(v9ses, iattr-ia_mode); -- 1.5.2.2 - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 08/25] ext234: have setattr functions call attr_kill_to_mode
.. and only save off ia_valid once it returns Signed-off-by: Jeff Layton [EMAIL PROTECTED] --- fs/ext2/inode.c |1 + fs/ext3/inode.c |5 - fs/ext4/inode.c |5 - 3 files changed, 9 insertions(+), 2 deletions(-) diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c index 0079b2c..33dfc48 100644 --- a/fs/ext2/inode.c +++ b/fs/ext2/inode.c @@ -1311,6 +1311,7 @@ int ext2_setattr(struct dentry *dentry, struct iattr *iattr) struct inode *inode = dentry-d_inode; int error; + attr_kill_to_mode(inode, iattr); error = inode_change_ok(inode, iattr); if (error) return error; diff --git a/fs/ext3/inode.c b/fs/ext3/inode.c index de4e316..5ca1085 100644 --- a/fs/ext3/inode.c +++ b/fs/ext3/inode.c @@ -2919,7 +2919,10 @@ int ext3_setattr(struct dentry *dentry, struct iattr *attr) { struct inode *inode = dentry-d_inode; int error, rc = 0; - const unsigned int ia_valid = attr-ia_valid; + unsigned int ia_valid; + + attr_kill_to_mode(inode, attr); + ia_valid = attr-ia_valid; error = inode_change_ok(inode, attr); if (error) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index a4848e0..9a51e44 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -2935,7 +2935,10 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr) { struct inode *inode = dentry-d_inode; int error, rc = 0; - const unsigned int ia_valid = attr-ia_valid; + unsigned int ia_valid; + + attr_kill_to_mode(inode, attr); + ia_valid = attr-ia_valid; error = inode_change_ok(inode, attr); if (error) -- 1.5.2.2 - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 06/25] coda: call attr_kill_to_mode from coda_setattr
Signed-off-by: Jeff Layton [EMAIL PROTECTED] --- fs/coda/inode.c |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/fs/coda/inode.c b/fs/coda/inode.c index 342f4e0..c06d031 100644 --- a/fs/coda/inode.c +++ b/fs/coda/inode.c @@ -239,6 +239,7 @@ int coda_setattr(struct dentry *de, struct iattr *iattr) memset(vattr, 0, sizeof(vattr)); inode-i_ctime = CURRENT_TIME_SEC; + attr_kill_to_mode(inode, iattr); coda_iattr_to_vattr(iattr, vattr); vattr.va_type = C_VNON; /* cannot set type */ -- 1.5.2.2 - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 12/25] HPFS: call attr_kill_to_mode from hpfs_notify_change
Signed-off-by: Jeff Layton [EMAIL PROTECTED] --- fs/hpfs/inode.c |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/fs/hpfs/inode.c b/fs/hpfs/inode.c index 85d3e1d..d2b26eb 100644 --- a/fs/hpfs/inode.c +++ b/fs/hpfs/inode.c @@ -265,6 +265,7 @@ int hpfs_notify_change(struct dentry *dentry, struct iattr *attr) struct inode *inode = dentry-d_inode; int error=0; lock_kernel(); + attr_kill_to_mode(inode, attr); if ( ((attr-ia_valid ATTR_SIZE) attr-ia_size inode-i_size) || (hpfs_sb(inode-i_sb)-sb_root == inode-i_ino) ) { error = -EINVAL; -- 1.5.2.2 - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 11/25] hostfs: call attr_kill_to_mode from hostfs_setattr
Signed-off-by: Jeff Layton [EMAIL PROTECTED] --- fs/hostfs/hostfs_kern.c |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/fs/hostfs/hostfs_kern.c b/fs/hostfs/hostfs_kern.c index c778620..ea75204 100644 --- a/fs/hostfs/hostfs_kern.c +++ b/fs/hostfs/hostfs_kern.c @@ -833,6 +833,8 @@ int hostfs_setattr(struct dentry *dentry, struct iattr *attr) int fd = HOSTFS_I(dentry-d_inode)-fd; + attr_kill_to_mode(dentry-d_inode, attr); + err = inode_change_ok(dentry-d_inode, attr); if (err) return err; -- 1.5.2.2 - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 01/25] VFS: move attr_kill logic from notify_change into helper function
Separate the handling of the local ia_valid bitmask from the one in attr-ia_valid. This allows us to hand off the actual handling of the ATTR_KILL_* flags to the .setattr i_op when one is defined. notify_change still needs to process those flags for the local ia_valid variable, since it uses that to decide whether to return early, and to pass a (hopefully) appropriate bitmask to fsnotify_change. Signed-off-by: Jeff Layton [EMAIL PROTECTED] --- fs/attr.c | 54 +-- include/linux/fs.h |1 + 2 files changed, 36 insertions(+), 19 deletions(-) diff --git a/fs/attr.c b/fs/attr.c index f8dfc22..47015e0 100644 --- a/fs/attr.c +++ b/fs/attr.c @@ -100,15 +100,39 @@ int inode_setattr(struct inode * inode, struct iattr * attr) } EXPORT_SYMBOL(inode_setattr); +void attr_kill_to_mode(struct inode *inode, struct iattr *attr) +{ + if (attr-ia_valid ATTR_KILL_SUID) { + attr-ia_valid = ~ATTR_KILL_SUID; + if (inode-i_mode S_ISUID) { + if (!(attr-ia_valid ATTR_MODE)) { + attr-ia_valid |= ATTR_MODE; + attr-ia_mode = inode-i_mode; + } + attr-ia_mode = ~S_ISUID; + } + } + if (attr-ia_valid ATTR_KILL_SGID) { + attr-ia_valid = ~ATTR_KILL_SGID; + if ((inode-i_mode (S_ISGID | S_IXGRP)) == + (S_ISGID | S_IXGRP)) { + if (!(attr-ia_valid ATTR_MODE)) { + attr-ia_valid |= ATTR_MODE; + attr-ia_mode = inode-i_mode; + } + attr-ia_mode = ~S_ISGID; + } + } +} +EXPORT_SYMBOL(attr_kill_to_mode); + int notify_change(struct dentry * dentry, struct iattr * attr) { struct inode *inode = dentry-d_inode; - mode_t mode; int error; struct timespec now; unsigned int ia_valid = attr-ia_valid; - mode = inode-i_mode; now = current_fs_time(inode-i_sb); attr-ia_ctime = now; @@ -117,26 +141,17 @@ int notify_change(struct dentry * dentry, struct iattr * attr) if (!(ia_valid ATTR_MTIME_SET)) attr-ia_mtime = now; if (ia_valid ATTR_KILL_SUID) { - attr-ia_valid = ~ATTR_KILL_SUID; - if (mode S_ISUID) { - if (!(ia_valid ATTR_MODE)) { - ia_valid = attr-ia_valid |= ATTR_MODE; - attr-ia_mode = inode-i_mode; - } - attr-ia_mode = ~S_ISUID; - } + ia_valid = ~ATTR_KILL_SUID; + if (inode-i_mode S_ISUID) + ia_valid |= ATTR_MODE; } if (ia_valid ATTR_KILL_SGID) { - attr-ia_valid = ~ ATTR_KILL_SGID; - if ((mode (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP)) { - if (!(ia_valid ATTR_MODE)) { - ia_valid = attr-ia_valid |= ATTR_MODE; - attr-ia_mode = inode-i_mode; - } - attr-ia_mode = ~S_ISGID; - } + ia_valid = ~ATTR_KILL_SGID; + if ((inode-i_mode (S_ISGID | S_IXGRP)) == + (S_ISGID | S_IXGRP)) + ia_valid |= ATTR_MODE; } - if (!attr-ia_valid) + if (!ia_valid) return 0; if (ia_valid ATTR_SIZE) @@ -147,6 +162,7 @@ int notify_change(struct dentry * dentry, struct iattr * attr) if (!error) error = inode-i_op-setattr(dentry, attr); } else { + attr_kill_to_mode(inode, attr); error = inode_change_ok(inode, attr); if (!error) error = security_inode_setattr(dentry, attr); diff --git a/include/linux/fs.h b/include/linux/fs.h index d33bead..f617b23 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1561,6 +1561,7 @@ extern int do_remount_sb(struct super_block *sb, int flags, #ifdef CONFIG_BLOCK extern sector_t bmap(struct inode *, sector_t); #endif +extern void attr_kill_to_mode(struct inode *inode, struct iattr *attr); extern int notify_change(struct dentry *, struct iattr *); extern int permission(struct inode *, int, struct nameidata *); extern int generic_permission(struct inode *, int, -- 1.5.2.2 - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 10/25] GFS2: have gfs2_setattr call attr_kill_to_mode
Signed-off-by: Jeff Layton [EMAIL PROTECTED] --- fs/gfs2/ops_inode.c |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/fs/gfs2/ops_inode.c b/fs/gfs2/ops_inode.c index 911c115..8401fc7 100644 --- a/fs/gfs2/ops_inode.c +++ b/fs/gfs2/ops_inode.c @@ -994,6 +994,8 @@ static int gfs2_setattr(struct dentry *dentry, struct iattr *attr) if (IS_IMMUTABLE(inode) || IS_APPEND(inode)) goto out; + attr_kill_to_mode(inode, attr); + error = inode_change_ok(inode, attr); if (error) goto out; -- 1.5.2.2 - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 19/25] reiserfs: call attr_kill_to_mode from reiserfs_setattr
..and only save off ia_valid after it returns Signed-off-by: Jeff Layton [EMAIL PROTECTED] --- fs/reiserfs/inode.c |6 +- 1 files changed, 5 insertions(+), 1 deletions(-) diff --git a/fs/reiserfs/inode.c b/fs/reiserfs/inode.c index ddde489..55b2aea 100644 --- a/fs/reiserfs/inode.c +++ b/fs/reiserfs/inode.c @@ -2896,7 +2896,11 @@ int reiserfs_setattr(struct dentry *dentry, struct iattr *attr) { struct inode *inode = dentry-d_inode; int error; - unsigned int ia_valid = attr-ia_valid; + unsigned int ia_valid; + + attr_kill_to_mode(inode, attr); + ia_valid = attr-ia_valid; + reiserfs_write_lock(inode-i_sb); if (attr-ia_valid ATTR_SIZE) { /* version 2 items will be caught by the s_maxbytes check -- 1.5.2.2 - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 21/25] sysfs: call attr_kill_to_mode from sysfs_setattr
..and only save off ia_valid after it returns Signed-off-by: Jeff Layton [EMAIL PROTECTED] --- fs/sysfs/inode.c |5 - 1 files changed, 4 insertions(+), 1 deletions(-) diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c index 10d1b52..38f2ba9 100644 --- a/fs/sysfs/inode.c +++ b/fs/sysfs/inode.c @@ -49,7 +49,7 @@ int sysfs_setattr(struct dentry * dentry, struct iattr * iattr) struct inode * inode = dentry-d_inode; struct sysfs_dirent * sd = dentry-d_fsdata; struct iattr * sd_iattr; - unsigned int ia_valid = iattr-ia_valid; + unsigned int ia_valid; int error; if (!sd) @@ -57,6 +57,9 @@ int sysfs_setattr(struct dentry * dentry, struct iattr * iattr) sd_iattr = sd-s_iattr; + attr_kill_to_mode(inode, iattr); + ia_valid = iattr-ia_valid; + error = inode_change_ok(inode, iattr); if (error) return error; -- 1.5.2.2 - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 23/25] XFS: call attr_kill_to_mode from xfs_vn_setattr
..and only save off ia_valid after it returns Signed-off-by: Jeff Layton [EMAIL PROTECTED] --- fs/xfs/linux-2.6/xfs_iops.c |5 - 1 files changed, 4 insertions(+), 1 deletions(-) diff --git a/fs/xfs/linux-2.6/xfs_iops.c b/fs/xfs/linux-2.6/xfs_iops.c index 0b5fa12..67fba53 100644 --- a/fs/xfs/linux-2.6/xfs_iops.c +++ b/fs/xfs/linux-2.6/xfs_iops.c @@ -651,12 +651,15 @@ xfs_vn_setattr( struct iattr*attr) { struct inode*inode = dentry-d_inode; - unsigned intia_valid = attr-ia_valid; + unsigned intia_valid; bhv_vnode_t *vp = vn_from_inode(inode); bhv_vattr_t vattr = { 0 }; int flags = 0; int error; + attr_kill_to_mode(inode, attr); + ia_valid = attr-ia_valid; + if (ia_valid ATTR_UID) { vattr.va_mask |= XFS_AT_UID; vattr.va_uid = attr-ia_uid; -- 1.5.2.2 - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 25/25] spufs: call attr_kill_to_mode from spufs_setattr
Signed-off-by: Jeff Layton [EMAIL PROTECTED] --- arch/powerpc/platforms/cell/spufs/inode.c |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/arch/powerpc/platforms/cell/spufs/inode.c b/arch/powerpc/platforms/cell/spufs/inode.c index b3d0dd1..d0f6b62 100644 --- a/arch/powerpc/platforms/cell/spufs/inode.c +++ b/arch/powerpc/platforms/cell/spufs/inode.c @@ -97,6 +97,7 @@ spufs_setattr(struct dentry *dentry, struct iattr *attr) { struct inode *inode = dentry-d_inode; + attr_kill_to_mode(inode, attr); if ((attr-ia_valid ATTR_SIZE) (attr-ia_size != inode-i_size)) return -EINVAL; -- 1.5.2.2 - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 20/25] smbfs: call attr_kill_to_mode from smb_notify_change
..seems unlikely that the sb mode would include a setuid bit, but I suppose it's possible. Signed-off-by: Jeff Layton [EMAIL PROTECTED] --- fs/smbfs/inode.c |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/fs/smbfs/inode.c b/fs/smbfs/inode.c index 73d1450..f42d191 100644 --- a/fs/smbfs/inode.c +++ b/fs/smbfs/inode.c @@ -675,6 +675,8 @@ smb_notify_change(struct dentry *dentry, struct iattr *attr) if (error) goto out; + attr_kill_to_mode(inode, attr); + if ((error = inode_change_ok(inode, attr)) 0) goto out; -- 1.5.2.2 - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 13/25] hugetlbfs: call attr_kill_to_mode from hugetlbfs_setattr
..and only set ia_valid after it returns Signed-off-by: Jeff Layton [EMAIL PROTECTED] --- fs/hugetlbfs/inode.c |5 - 1 files changed, 4 insertions(+), 1 deletions(-) diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c index c848a19..3566abb 100644 --- a/fs/hugetlbfs/inode.c +++ b/fs/hugetlbfs/inode.c @@ -340,10 +340,13 @@ static int hugetlbfs_setattr(struct dentry *dentry, struct iattr *attr) { struct inode *inode = dentry-d_inode; int error; - unsigned int ia_valid = attr-ia_valid; + unsigned int ia_valid; BUG_ON(!inode); + attr_kill_to_mode(inode, attr); + ia_valid = attr-ia_valid; + error = inode_change_ok(inode, attr); if (error) goto out; -- 1.5.2.2 - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 17/25] procfs: add attr_kill_to_mode to setattr functions
Not sure if this is really needed, but I don't think it will hurt anything Signed-off-by: Jeff Layton [EMAIL PROTECTED] --- fs/proc/base.c|3 +++ fs/proc/generic.c |3 +++ fs/proc/proc_sysctl.c |3 +++ 3 files changed, 9 insertions(+), 0 deletions(-) diff --git a/fs/proc/base.c b/fs/proc/base.c index 3c77d5a..d2e56ba 100644 --- a/fs/proc/base.c +++ b/fs/proc/base.c @@ -348,6 +348,9 @@ static int proc_setattr(struct dentry *dentry, struct iattr *attr) int error; struct inode *inode = dentry-d_inode; + /* FIXME: is this necessary? */ + attr_kill_to_mode(inode, attr); + if (attr-ia_valid ATTR_MODE) return -EPERM; diff --git a/fs/proc/generic.c b/fs/proc/generic.c index b5e7155..88d736c 100644 --- a/fs/proc/generic.c +++ b/fs/proc/generic.c @@ -239,6 +239,9 @@ static int proc_notify_change(struct dentry *dentry, struct iattr *iattr) struct proc_dir_entry *de = PDE(inode); int error; + /* FIXME: is this necessary? */ + attr_kill_to_mode(inode, iattr); + error = inode_change_ok(inode, iattr); if (error) goto out; diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c index 680c429..37c0cca 100644 --- a/fs/proc/proc_sysctl.c +++ b/fs/proc/proc_sysctl.c @@ -425,6 +425,9 @@ static int proc_sys_setattr(struct dentry *dentry, struct iattr *attr) struct inode *inode = dentry-d_inode; int error; + /* FIXME: is this necessary */ + attr_kill_to_mode(inode, attr); + if (attr-ia_valid (ATTR_MODE | ATTR_UID | ATTR_GID)) return -EPERM; -- 1.5.2.2 - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 16/25] OCFS2: call attr_kill_to_mode from ocfs2_setattr
Signed-off-by: Jeff Layton [EMAIL PROTECTED] --- fs/ocfs2/file.c |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c index c4034f6..d05d472 100644 --- a/fs/ocfs2/file.c +++ b/fs/ocfs2/file.c @@ -972,6 +972,8 @@ int ocfs2_setattr(struct dentry *dentry, struct iattr *attr) mlog_entry((0x%p, '%.*s')\n, dentry, dentry-d_name.len, dentry-d_name.name); + attr_kill_to_mode(inode, attr); + if (attr-ia_valid ATTR_MODE) mlog(0, mode change: %d\n, attr-ia_mode); if (attr-ia_valid ATTR_UID) -- 1.5.2.2 - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 00/25] move handling of setuid/gid bits from VFS into individual setattr functions
( Please consider trimming the Cc list if discussing some aspect of this that doesn't concern everyone ) When an unprivileged process attempts to modify a file that has the setuid or setgid bits set, the VFS will attempt to clear these bits. The VFS will set the ATTR_KILL_SUID or ATTR_KILL_SGID bits in the ia_valid mask, and then call notify_change to clear these bits and set the mode accordingly. With a networked filesystem (NFS in particular but most likely others), the client machine may not have credentials that allow for the clearing of these bits. In some situations, this can lead to file corruption, or to an operation failing outright because the setattr fails. In this situation, we'd like to just leave the handling of this to the server and ignore these bits. The problem is that by the time nfs_setattr is called, the VFS has already reinterpreted the ATTR_KILL_* bits into a mode change. We can't fix this in the filesystems where this is a problem, as doing so would leave us having to second-guess what the VFS wants us to do. So we need to change it so that filesystems have more flexibility in how to interpret the ATTR_KILL_* bits. The first patch in the following patchset moves this logic into a helper function, and then only calls this helper function for inodes that do not have a setattr operation defined. The subsequent patches fix up individual filesystem setattr functions to call this helper function. The upshot of this is that with this change, filesystems that define a setattr inode operation are now responsible for handling the ATTR_KILL bits as well. They can trivially do so by calling the helper, but they must do so. Some of the follow-on patches may not be strictly necessary, but I decided that it was better to take the conservative approach and call the helper when I wasn't sure. I've tried to CC the maintainers for the individual filesystems as well where I could find them, please let me know if there are others who should be informed. Comments and suggestions appreciated... Signed-off-by: Jeff Layton [EMAIL PROTECTED] - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 15/25] JFS: call attr_kill_to_mode from jfs_setattr
Signed-off-by: Jeff Layton [EMAIL PROTECTED] --- fs/jfs/acl.c |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/fs/jfs/acl.c b/fs/jfs/acl.c index 4d84bdc..34ca314 100644 --- a/fs/jfs/acl.c +++ b/fs/jfs/acl.c @@ -227,6 +227,8 @@ int jfs_setattr(struct dentry *dentry, struct iattr *iattr) struct inode *inode = dentry-d_inode; int rc; + attr_kill_to_mode(inode, iattr); + rc = inode_change_ok(inode, iattr); if (rc) return rc; -- 1.5.2.2 - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 14/25] JFFS2: call attr_kill_to_mode from jffs2_do_setattr
Signed-off-by: Jeff Layton [EMAIL PROTECTED] --- fs/jffs2/fs.c |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/fs/jffs2/fs.c b/fs/jffs2/fs.c index 1d3b7a9..5218f04 100644 --- a/fs/jffs2/fs.c +++ b/fs/jffs2/fs.c @@ -37,6 +37,7 @@ static int jffs2_do_setattr (struct inode *inode, struct iattr *iattr) uint32_t alloclen; int ret; D1(printk(KERN_DEBUG jffs2_setattr(): ino #%lu\n, inode-i_ino)); + attr_kill_to_mode(inode, iattr); ret = inode_change_ok(inode, iattr); if (ret) return ret; -- 1.5.2.2 - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 05/25] CIFS: call attr_kill_to_mode in cifs_setattr
Signed-off-by: Jeff Layton [EMAIL PROTECTED] --- fs/cifs/inode.c |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c index dd41677..6fee1fa 100644 --- a/fs/cifs/inode.c +++ b/fs/cifs/inode.c @@ -1429,6 +1429,8 @@ int cifs_setattr(struct dentry *direntry, struct iattr *attrs) cifs_sb = CIFS_SB(direntry-d_inode-i_sb); pTcon = cifs_sb-tcon; + attr_kill_to_mode(direntry-d_inode, attrs); + if ((cifs_sb-mnt_cifs_flags CIFS_MOUNT_NO_PERM) == 0) { /* check if we have permission to change attrs */ rc = inode_change_ok(direntry-d_inode, attrs); -- 1.5.2.2 - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 00/25] move handling of setuid/gid bits from VFS into individual setattr functions (RESEND)
Apologies for the resend, but the original sending had the date in the email header and it caused some of these to bounce... ( Please consider trimming the Cc list if discussing some aspect of this that doesn't concern everyone.) When an unprivileged process attempts to modify a file that has the setuid or setgid bits set, the VFS will attempt to clear these bits. The VFS will set the ATTR_KILL_SUID or ATTR_KILL_SGID bits in the ia_valid mask, and then call notify_change to clear these bits and set the mode accordingly. With a networked filesystem (NFS in particular but most likely others), the client machine may not have credentials that allow for the clearing of these bits. In some situations, this can lead to file corruption, or to an operation failing outright because the setattr fails. In this situation, we'd like to just leave the handling of this to the server and ignore these bits. The problem is that by the time nfs_setattr is called, the VFS has already reinterpreted the ATTR_KILL_* bits into a mode change. We can't fix this in the filesystems where this is a problem, as doing so would leave us having to second-guess what the VFS wants us to do. So we need to change it so that filesystems have more flexibility in how to interpret the ATTR_KILL_* bits. The first patch in the following patchset moves this logic into a helper function, and then only calls this helper function for inodes that do not have a setattr operation defined. The subsequent patches fix up individual filesystem setattr functions to call this helper function. The upshot of this is that with this change, filesystems that define a setattr inode operation are now responsible for handling the ATTR_KILL bits as well. They can trivially do so by calling the helper, but they must do so. Some of the follow-on patches may not be strictly necessary, but I decided that it was better to take the conservative approach and call the helper when I wasn't sure. I've tried to CC the maintainers for the individual filesystems as well where I could find them, please let me know if there are others who should be informed. Comments and suggestions appreciated... Signed-off-by: Jeff Layton [EMAIL PROTECTED] - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 08/25] ext234: have setattr functions call attr_kill_to_mode
.. and only save off ia_valid once it returns Signed-off-by: Jeff Layton [EMAIL PROTECTED] --- fs/ext2/inode.c |1 + fs/ext3/inode.c |5 - fs/ext4/inode.c |5 - 3 files changed, 9 insertions(+), 2 deletions(-) diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c index 0079b2c..33dfc48 100644 --- a/fs/ext2/inode.c +++ b/fs/ext2/inode.c @@ -1311,6 +1311,7 @@ int ext2_setattr(struct dentry *dentry, struct iattr *iattr) struct inode *inode = dentry-d_inode; int error; + attr_kill_to_mode(inode, iattr); error = inode_change_ok(inode, iattr); if (error) return error; diff --git a/fs/ext3/inode.c b/fs/ext3/inode.c index de4e316..5ca1085 100644 --- a/fs/ext3/inode.c +++ b/fs/ext3/inode.c @@ -2919,7 +2919,10 @@ int ext3_setattr(struct dentry *dentry, struct iattr *attr) { struct inode *inode = dentry-d_inode; int error, rc = 0; - const unsigned int ia_valid = attr-ia_valid; + unsigned int ia_valid; + + attr_kill_to_mode(inode, attr); + ia_valid = attr-ia_valid; error = inode_change_ok(inode, attr); if (error) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index a4848e0..9a51e44 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -2935,7 +2935,10 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr) { struct inode *inode = dentry-d_inode; int error, rc = 0; - const unsigned int ia_valid = attr-ia_valid; + unsigned int ia_valid; + + attr_kill_to_mode(inode, attr); + ia_valid = attr-ia_valid; error = inode_change_ok(inode, attr); if (error) -- 1.5.2.2 - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 06/25] coda: call attr_kill_to_mode from coda_setattr
Signed-off-by: Jeff Layton [EMAIL PROTECTED] --- fs/coda/inode.c |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/fs/coda/inode.c b/fs/coda/inode.c index 342f4e0..c06d031 100644 --- a/fs/coda/inode.c +++ b/fs/coda/inode.c @@ -239,6 +239,7 @@ int coda_setattr(struct dentry *de, struct iattr *iattr) memset(vattr, 0, sizeof(vattr)); inode-i_ctime = CURRENT_TIME_SEC; + attr_kill_to_mode(inode, iattr); coda_iattr_to_vattr(iattr, vattr); vattr.va_type = C_VNON; /* cannot set type */ -- 1.5.2.2 - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 10/25] GFS2: have gfs2_setattr call attr_kill_to_mode
Signed-off-by: Jeff Layton [EMAIL PROTECTED] --- fs/gfs2/ops_inode.c |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/fs/gfs2/ops_inode.c b/fs/gfs2/ops_inode.c index 911c115..8401fc7 100644 --- a/fs/gfs2/ops_inode.c +++ b/fs/gfs2/ops_inode.c @@ -994,6 +994,8 @@ static int gfs2_setattr(struct dentry *dentry, struct iattr *attr) if (IS_IMMUTABLE(inode) || IS_APPEND(inode)) goto out; + attr_kill_to_mode(inode, attr); + error = inode_change_ok(inode, attr); if (error) goto out; -- 1.5.2.2 - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 09/25] FUSE: Call attr_kill_to_mode from fuse_setattr
Signed-off-by: Jeff Layton [EMAIL PROTECTED] --- fs/fuse/dir.c |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c index bd5a772..60c8d5e 100644 --- a/fs/fuse/dir.c +++ b/fs/fuse/dir.c @@ -1016,6 +1016,8 @@ static int fuse_setattr(struct dentry *entry, struct iattr *attr) int err; int is_truncate = 0; + attr_kill_to_mode(inode, attr); + if (fc-flags FUSE_DEFAULT_PERMISSIONS) { err = inode_change_ok(inode, attr); if (err) -- 1.5.2.2 - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 11/25] hostfs: call attr_kill_to_mode from hostfs_setattr
Signed-off-by: Jeff Layton [EMAIL PROTECTED] --- fs/hostfs/hostfs_kern.c |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/fs/hostfs/hostfs_kern.c b/fs/hostfs/hostfs_kern.c index c778620..ea75204 100644 --- a/fs/hostfs/hostfs_kern.c +++ b/fs/hostfs/hostfs_kern.c @@ -833,6 +833,8 @@ int hostfs_setattr(struct dentry *dentry, struct iattr *attr) int fd = HOSTFS_I(dentry-d_inode)-fd; + attr_kill_to_mode(dentry-d_inode, attr); + err = inode_change_ok(dentry-d_inode, attr); if (err) return err; -- 1.5.2.2 - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 12/25] HPFS: call attr_kill_to_mode from hpfs_notify_change
Signed-off-by: Jeff Layton [EMAIL PROTECTED] --- fs/hpfs/inode.c |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/fs/hpfs/inode.c b/fs/hpfs/inode.c index 85d3e1d..d2b26eb 100644 --- a/fs/hpfs/inode.c +++ b/fs/hpfs/inode.c @@ -265,6 +265,7 @@ int hpfs_notify_change(struct dentry *dentry, struct iattr *attr) struct inode *inode = dentry-d_inode; int error=0; lock_kernel(); + attr_kill_to_mode(inode, attr); if ( ((attr-ia_valid ATTR_SIZE) attr-ia_size inode-i_size) || (hpfs_sb(inode-i_sb)-sb_root == inode-i_ino) ) { error = -EINVAL; -- 1.5.2.2 - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 13/25] hugetlbfs: call attr_kill_to_mode from hugetlbfs_setattr
..and only set ia_valid after it returns Signed-off-by: Jeff Layton [EMAIL PROTECTED] --- fs/hugetlbfs/inode.c |5 - 1 files changed, 4 insertions(+), 1 deletions(-) diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c index c848a19..3566abb 100644 --- a/fs/hugetlbfs/inode.c +++ b/fs/hugetlbfs/inode.c @@ -340,10 +340,13 @@ static int hugetlbfs_setattr(struct dentry *dentry, struct iattr *attr) { struct inode *inode = dentry-d_inode; int error; - unsigned int ia_valid = attr-ia_valid; + unsigned int ia_valid; BUG_ON(!inode); + attr_kill_to_mode(inode, attr); + ia_valid = attr-ia_valid; + error = inode_change_ok(inode, attr); if (error) goto out; -- 1.5.2.2 - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 14/25] JFFS2: call attr_kill_to_mode from jffs2_do_setattr
Signed-off-by: Jeff Layton [EMAIL PROTECTED] --- fs/jffs2/fs.c |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/fs/jffs2/fs.c b/fs/jffs2/fs.c index 1d3b7a9..5218f04 100644 --- a/fs/jffs2/fs.c +++ b/fs/jffs2/fs.c @@ -37,6 +37,7 @@ static int jffs2_do_setattr (struct inode *inode, struct iattr *iattr) uint32_t alloclen; int ret; D1(printk(KERN_DEBUG jffs2_setattr(): ino #%lu\n, inode-i_ino)); + attr_kill_to_mode(inode, iattr); ret = inode_change_ok(inode, iattr); if (ret) return ret; -- 1.5.2.2 - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 17/25] procfs: add attr_kill_to_mode to setattr functions
Not sure if this is really needed, but I don't think it will hurt anything Signed-off-by: Jeff Layton [EMAIL PROTECTED] --- fs/proc/base.c|3 +++ fs/proc/generic.c |3 +++ fs/proc/proc_sysctl.c |3 +++ 3 files changed, 9 insertions(+), 0 deletions(-) diff --git a/fs/proc/base.c b/fs/proc/base.c index 3c77d5a..d2e56ba 100644 --- a/fs/proc/base.c +++ b/fs/proc/base.c @@ -348,6 +348,9 @@ static int proc_setattr(struct dentry *dentry, struct iattr *attr) int error; struct inode *inode = dentry-d_inode; + /* FIXME: is this necessary? */ + attr_kill_to_mode(inode, attr); + if (attr-ia_valid ATTR_MODE) return -EPERM; diff --git a/fs/proc/generic.c b/fs/proc/generic.c index b5e7155..88d736c 100644 --- a/fs/proc/generic.c +++ b/fs/proc/generic.c @@ -239,6 +239,9 @@ static int proc_notify_change(struct dentry *dentry, struct iattr *iattr) struct proc_dir_entry *de = PDE(inode); int error; + /* FIXME: is this necessary? */ + attr_kill_to_mode(inode, iattr); + error = inode_change_ok(inode, iattr); if (error) goto out; diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c index 680c429..37c0cca 100644 --- a/fs/proc/proc_sysctl.c +++ b/fs/proc/proc_sysctl.c @@ -425,6 +425,9 @@ static int proc_sys_setattr(struct dentry *dentry, struct iattr *attr) struct inode *inode = dentry-d_inode; int error; + /* FIXME: is this necessary */ + attr_kill_to_mode(inode, attr); + if (attr-ia_valid (ATTR_MODE | ATTR_UID | ATTR_GID)) return -EPERM; -- 1.5.2.2 - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 18/25] ramfs_nommu: call attr_kill_to_mode from ramfs_nommu_setattr
..and only set old_ia_valid after it returns Signed-off-by: Jeff Layton [EMAIL PROTECTED] --- fs/ramfs/file-nommu.c |5 - 1 files changed, 4 insertions(+), 1 deletions(-) diff --git a/fs/ramfs/file-nommu.c b/fs/ramfs/file-nommu.c index cad2b7a..c354883 100644 --- a/fs/ramfs/file-nommu.c +++ b/fs/ramfs/file-nommu.c @@ -192,9 +192,12 @@ static int ramfs_nommu_resize(struct inode *inode, loff_t newsize, loff_t size) static int ramfs_nommu_setattr(struct dentry *dentry, struct iattr *ia) { struct inode *inode = dentry-d_inode; - unsigned int old_ia_valid = ia-ia_valid; + unsigned int old_ia_valid; int ret = 0; + attr_kill_to_mode(inode, ia); + old_ia_valid = ia-ia_valid; + /* POSIX UID/GID verification for setting inode attributes */ ret = inode_change_ok(inode, ia); if (ret) -- 1.5.2.2 - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 19/25] reiserfs: call attr_kill_to_mode from reiserfs_setattr
..and only save off ia_valid after it returns Signed-off-by: Jeff Layton [EMAIL PROTECTED] --- fs/reiserfs/inode.c |6 +- 1 files changed, 5 insertions(+), 1 deletions(-) diff --git a/fs/reiserfs/inode.c b/fs/reiserfs/inode.c index ddde489..55b2aea 100644 --- a/fs/reiserfs/inode.c +++ b/fs/reiserfs/inode.c @@ -2896,7 +2896,11 @@ int reiserfs_setattr(struct dentry *dentry, struct iattr *attr) { struct inode *inode = dentry-d_inode; int error; - unsigned int ia_valid = attr-ia_valid; + unsigned int ia_valid; + + attr_kill_to_mode(inode, attr); + ia_valid = attr-ia_valid; + reiserfs_write_lock(inode-i_sb); if (attr-ia_valid ATTR_SIZE) { /* version 2 items will be caught by the s_maxbytes check -- 1.5.2.2 - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 25/25] spufs: call attr_kill_to_mode from spufs_setattr
Signed-off-by: Jeff Layton [EMAIL PROTECTED] --- arch/powerpc/platforms/cell/spufs/inode.c |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/arch/powerpc/platforms/cell/spufs/inode.c b/arch/powerpc/platforms/cell/spufs/inode.c index b3d0dd1..d0f6b62 100644 --- a/arch/powerpc/platforms/cell/spufs/inode.c +++ b/arch/powerpc/platforms/cell/spufs/inode.c @@ -97,6 +97,7 @@ spufs_setattr(struct dentry *dentry, struct iattr *attr) { struct inode *inode = dentry-d_inode; + attr_kill_to_mode(inode, attr); if ((attr-ia_valid ATTR_SIZE) (attr-ia_size != inode-i_size)) return -EINVAL; -- 1.5.2.2 - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 02/25] 9p: have v9fs_vfs_setattr call attr_kill_to_mode
Signed-off-by: Jeff Layton [EMAIL PROTECTED] --- fs/9p/vfs_inode.c |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c index e5c45ee..00fcd4e 100644 --- a/fs/9p/vfs_inode.c +++ b/fs/9p/vfs_inode.c @@ -756,6 +756,7 @@ static int v9fs_vfs_setattr(struct dentry *dentry, struct iattr *iattr) return PTR_ERR(fid); v9fs_blank_wstat(wstat); + attr_kill_to_mode(dentry-d_inode, iattr); if (iattr-ia_valid ATTR_MODE) wstat.mode = unixmode2p9mode(v9ses, iattr-ia_mode); -- 1.5.2.2 - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [fuse-devel] [PATCH 01/25] VFS: move attr_kill logic from notify_change into helper function
I agree with this change and fuse will make use of it as well. Maybe instead of unconditionally moving attr_kill_to_mode() inside -setattr() it could be made conditional based on an inode flag similarly to S_NOCMTIME. Advantages: - no need to modify a lot of in-tree filesystems - no silent breakage of out-of-tree fs Actually I think the new flag would be used by exacly the same filesystems as S_NOCMTIME, so maybe it would make sense to rename S_NOCMTIME to something more generic (S_NOATTRUPDATE or whatever) and use that. But that could still break out-of-tree fs, so a separate flag is probably better. In the past I've been told that adding new flags is something of a last resort. Since it's not strictly necessary to fix this then it may be best to avoid that. That said, if the concensus is that we need a transition mechanism, then I'd be open to such a suggestion. I think there's really no other choice here. Your patch is changing the API in a very unsafe way, since there will be no error or warning on an unconverted fs. And that could lead to security holes. If we would rename the setattr method to setattr_new as well as changing it's behavior, that would be fine. But I guess we do not want to do that. Miklos - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [fuse-devel] [PATCH 01/25] VFS: move attr_kill logic from notify_change into helper function
On Mon, 2007-08-06 at 20:28 +0200, Miklos Szeredi wrote: Your patch is changing the API in a very unsafe way, since there will be no error or warning on an unconverted fs. And that could lead to security holes. If we would rename the setattr method to setattr_new as well as changing it's behavior, that would be fine. But I guess we do not want to do that. Which unconverted fses? If we're talking out of tree stuff, then too bad: it is _their_ responsibility to keep up with kernel changes. Trond - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [fuse-devel] [PATCH 01/25] VFS: move attr_kill logic from notify_change into helper function
Your patch is changing the API in a very unsafe way, since there will be no error or warning on an unconverted fs. And that could lead to security holes. If we would rename the setattr method to setattr_new as well as changing it's behavior, that would be fine. But I guess we do not want to do that. Which unconverted fses? If we're talking out of tree stuff, then too bad: it is _their_ responsibility to keep up with kernel changes. It is usually a good idea to not change the semantics of an API in a backward incompatible way without changing the syntax as well. This is true regardless of whether we care about out-of-tree code or not (and we should care to some degree). And especially true if the change in question is security sensitive. Miklos - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [fuse-devel] [PATCH 01/25] VFS: move attr_kill logic from notify_change into helper function
On Mon, 2007-08-06 at 21:37 +0200, Miklos Szeredi wrote: Your patch is changing the API in a very unsafe way, since there will be no error or warning on an unconverted fs. And that could lead to security holes. If we would rename the setattr method to setattr_new as well as changing it's behavior, that would be fine. But I guess we do not want to do that. Which unconverted fses? If we're talking out of tree stuff, then too bad: it is _their_ responsibility to keep up with kernel changes. It is usually a good idea to not change the semantics of an API in a backward incompatible way without changing the syntax as well. We're taking two setattr flags ATTR_KILL_SGID, and ATTR_KILL_SUID which have existed for several years in the VFS, and making them visible to the filesystems. Out-of-tree filesystems that care can check for them in a completely backward compatible way: you don't even need to add a #define. This is true regardless of whether we care about out-of-tree code or not (and we should care to some degree). And especially true if the change in question is security sensitive. It is not true regardless: the in-tree code is being converted. Out-of-tree code is the only problem here, and their only problem is that they may have to add support for the new flags if they also support suid/sgid mode bits. Are you advocating reserving a new filesystem bit every time we need to add an attribute flag? Trond - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [EXT4 set 8][PATCH 1/1]Add journal checksums
On Wed, 2007-08-01 at 12:34 +0530, Girish Shilamkar wrote: On Wed, 2007-07-11 at 17:16 +0530, Girish Shilamkar wrote: I will make the changes and send an incremental patch. Hi, I have made the changes and attached the incremental patch as per the review. Thanks, I merged your changes to ext4-patch-queue. Mingming - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[patch][rfc] fs: fix nobh error handling
Hi, So I seem to have hit some resistance to the idea of removing nobh mode with the new aops patches. I don't need to reiterate my ideas for disliking nobh mode (if the buffer layer is crap, it should be improved rather than circumvented -- see fsblock). But at any rate, this view does not seem to be unanimous, and it is completely sensible to be adverse to removing things that exist and somewhat work today. So I've been mulling around the best way to have nobh and new aoips coexist nicely. I think I've got a reasonable idea now... the main problem with nobh mode is that it blissfully throws away most of the buffer state; when an error does happen, it's up shit creek. This becomes more of a problem with error recovery after a short write allowed by the new aop API: the short write is actually going to happen relatively often as part of normal operation, rather than just when the disk subsystem fails. Still, the fix for existing error handling will look basically the same as the fix for short write error handling. So I propose this patch against head in order not to confuse the issues (and to fix the existing bugs in release kernels). If this gets accepted, then I can subsequently look at doing _additional_ patches for the new aops series in -mm. Note: I actually did observe some of these problems by using IO error injection. I wouldn't say the patch is completely stress tested yet, but it did survive the same tests without any file corruption. Nick --- nobh mode error handling is not just pretty slack, it's wrong. One cannot zero out the whole page to ensure new blocks are zeroed, because it just brings the whole page uptodate with zeroes even if that may not be the correct uptodate data. Also, other parts of the page may already contain dirty data which would get lost by zeroing it out. Thirdly, the writeback of zeroes to the new blocks will also erase existing blocks. All these conditions are pagecache and/or filesystem corruption. The problem comes about because we didn't keep track of which buffers actually are new or old. However it is not enough just to keep only this state, because at the point we start dirtying parts of the page (new blocks, with zeroes), the handling of IO errors becomes impossible without buffers because the page may only be partially uptodate, in which case the page flags allone cannot capture the state of the parts of the page. So allocate all buffers for the page upfront, but leave them unattached so that they don't pick up any other references and can be freed when we're done. If the error path is hit, then zero the new buffers as the regular buffer path does, then attach the buffers to the page so that it can actually be written out correctly and be subject to the normal IO error handling paths. As an upshot, we save 1K of kernel stack on ia64 or powerpc 64K page systems. Signed-off-by: Nick Piggin [EMAIL PROTECTED] -- Index: linux-2.6/fs/buffer.c === --- linux-2.6.orig/fs/buffer.c +++ linux-2.6/fs/buffer.c @@ -2272,51 +2272,53 @@ int nobh_prepare_write(struct page *page struct inode *inode = page-mapping-host; const unsigned blkbits = inode-i_blkbits; const unsigned blocksize = 1 blkbits; - struct buffer_head map_bh; - struct buffer_head *read_bh[MAX_BUF_PER_PAGE]; + struct buffer_head *head, *bh; unsigned block_in_page; - unsigned block_start; + unsigned block_start, block_end; sector_t block_in_file; char *kaddr; int nr_reads = 0; - int i; int ret = 0; int is_mapped_to_disk = 1; + if (PagePrivate(page)) + return block_prepare_write(page, from, to, get_block); + if (PageMappedToDisk(page)) return 0; + head = alloc_page_buffers(page, blocksize, 1); + block_in_file = (sector_t)page-index (PAGE_CACHE_SHIFT - blkbits); - map_bh.b_page = page; /* * We loop across all blocks in the page, whether or not they are * part of the affected region. This is so we can discover if the * page is fully mapped-to-disk. */ - for (block_start = 0, block_in_page = 0; + for (block_start = 0, block_in_page = 0, bh = head; block_start PAGE_CACHE_SIZE; - block_in_page++, block_start += blocksize) { - unsigned block_end = block_start + blocksize; + block_in_page++, block_start += blocksize, bh = bh-b_this_page) { int create; - map_bh.b_state = 0; + block_end = block_start + blocksize; + bh-b_state = 0; create = 1; if (block_start = to) create = 0; - map_bh.b_size = blocksize; ret = get_block(inode, block_in_file + block_in_page, - map_bh,