Re: 2.6.23-mm1: BUG in reiserfs_delete_xattrs

2007-10-15 Thread Christoph Hellwig
On Mon, Oct 15, 2007 at 12:34:58AM +0200, Laurent Riffard wrote:
 reiserfs_delete_xattrs
 reiserfs_delete_inode
 generic_delete_inode
 generic_drop_inode
 iput
 do_unlinkat
 sys_unlink
 sys_enter_past_esp
 
 I reported a similar BUG in 2.6.22-rc8-mm2 (see
 http://lkml.org/lkml/2007/9/27/235). Dave Hansen sent a patch for it, I
 tested it and it was OK for 2.6.22-rc8-mm2.
 
 I tried this patch on 2.6.23-mm1, and it fixed the BUGs here too.

The delete path is a similar case as the one Dave fixed, also cause by
a NULL vfsmount passed to dentry_open, but through a different code-path.

Untested fix for this problem below:

Index: linux-2.6.23-rc8/fs/reiserfs/xattr.c
===
--- linux-2.6.23-rc8.orig/fs/reiserfs/xattr.c   2007-09-30 14:13:46.0 
+0200
+++ linux-2.6.23-rc8/fs/reiserfs/xattr.c2007-09-30 14:18:30.0 
+0200
@@ -207,9 +207,8 @@ static struct dentry *get_xa_file_dentry
  * we're called with i_mutex held, so there are no worries about the directory
  * changing underneath us.
  */
-static int __xattr_readdir(struct file *filp, void *dirent, filldir_t filldir)
+static int __xattr_readdir(struct inode *inode, void *dirent, filldir_t 
filldir)
 {
-   struct inode *inode = filp-f_path.dentry-d_inode;
struct cpu_key pos_key; /* key of current position in the directory 
(key of directory entry) */
INITIALIZE_PATH(path_to_entry);
struct buffer_head *bh;
@@ -352,24 +351,19 @@ static int __xattr_readdir(struct file *
  * this is stolen from vfs_readdir
  *
  */
-static
-int xattr_readdir(struct file *file, filldir_t filler, void *buf)
+static int xattr_readdir(struct inode *inode, filldir_t filler, void *buf)
 {
-   struct inode *inode = file-f_path.dentry-d_inode;
int res = -ENOTDIR;
-   if (!file-f_op || !file-f_op-readdir)
-   goto out;
+
mutex_lock_nested(inode-i_mutex, I_MUTEX_XATTR);
-//down(inode-i_zombie);
res = -ENOENT;
if (!IS_DEADDIR(inode)) {
lock_kernel();
-   res = __xattr_readdir(file, buf, filler);
+   res = __xattr_readdir(inode, buf, filler);
unlock_kernel();
}
-//up(inode-i_zombie);
mutex_unlock(inode-i_mutex);
-  out:
+
return res;
 }
 
@@ -721,7 +715,6 @@ reiserfs_delete_xattrs_filler(void *buf,
 /* This is called w/ inode-i_mutex downed */
 int reiserfs_delete_xattrs(struct inode *inode)
 {
-   struct file *fp;
struct dentry *dir, *root;
int err = 0;
 
@@ -742,15 +735,8 @@ int reiserfs_delete_xattrs(struct inode 
return 0;
}
 
-   fp = dentry_open(dir, NULL, O_RDWR);
-   if (IS_ERR(fp)) {
-   err = PTR_ERR(fp);
-   /* dentry_open dputs the dentry if it fails */
-   goto out;
-   }
-
lock_kernel();
-   err = xattr_readdir(fp, reiserfs_delete_xattrs_filler, dir);
+   err = xattr_readdir(dir-d_inode, reiserfs_delete_xattrs_filler, dir);
if (err) {
unlock_kernel();
goto out_dir;
@@ -770,7 +756,7 @@ int reiserfs_delete_xattrs(struct inode 
unlock_kernel();
 
   out_dir:
-   fput(fp);
+   dput(dir);
 
   out:
if (!err)
@@ -812,7 +798,6 @@ reiserfs_chown_xattrs_filler(void *buf, 
 
 int reiserfs_chown_xattrs(struct inode *inode, struct iattr *attrs)
 {
-   struct file *fp;
struct dentry *dir;
int err = 0;
struct reiserfs_chown_buf buf;
@@ -836,13 +821,6 @@ int reiserfs_chown_xattrs(struct inode *
goto out;
}
 
-   fp = dentry_open(dir, NULL, O_RDWR);
-   if (IS_ERR(fp)) {
-   err = PTR_ERR(fp);
-   /* dentry_open dputs the dentry if it fails */
-   goto out;
-   }
-
lock_kernel();
 
attrs-ia_valid = (ATTR_UID | ATTR_GID | ATTR_CTIME);
@@ -850,7 +828,7 @@ int reiserfs_chown_xattrs(struct inode *
buf.attrs = attrs;
buf.inode = inode;
 
-   err = xattr_readdir(fp, reiserfs_chown_xattrs_filler, buf);
+   err = xattr_readdir(dir-d_inode, reiserfs_chown_xattrs_filler, buf);
if (err) {
unlock_kernel();
goto out_dir;
@@ -860,7 +838,7 @@ int reiserfs_chown_xattrs(struct inode *
unlock_kernel();
 
   out_dir:
-   fput(fp);
+   dput(dir);
 
   out:
attrs-ia_valid = ia_valid;
@@ -1008,7 +986,6 @@ reiserfs_listxattr_filler(void *buf, con
  */
 ssize_t reiserfs_listxattr(struct dentry * dentry, char *buffer, size_t size)
 {
-   struct file *fp;
struct dentry *dir;
int err = 0;
struct reiserfs_listxattr_buf buf;
@@ -1031,13 +1008,6 @@ ssize_t reiserfs_listxattr(struct dentry
goto out;
}
 
-   fp = dentry_open(dir, NULL, O_RDWR);
-   if (IS_ERR(fp)) {
-   err = PTR_ERR(fp);
-   /* 

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-15 Thread Pekka Enberg
Hi,

On 10/15/07, Erez Zadok [EMAIL PROTECTED] wrote:
 Pekka, with a small change to your patch (to handle time-based cache
 coherency), your patch worked well and passed all my tests.  Thanks.

 So now I wonder if we still need the patch to prevent AOP_WRITEPAGE_ACTIVATE
 from being returned to userland.  I guess we still need it, b/c even with
 your patch, generic_writepages() can return AOP_WRITEPAGE_ACTIVATE back to
 the VFS and we need to ensure that doesn't leak outside the kernel.

I wonder whether _not setting_ BDI_CAP_NO_WRITEBACK implies that
-writepage() will never return AOP_WRITEPAGE_ACTIVATE for
!wbc-for_reclaim case which would explain why we haven't hit this bug
before. Hugh, Andrew?

And btw, I think we need to fix ecryptfs too.

   Pekka
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: XFS regression?

2007-10-15 Thread David Chinner
On Mon, Oct 15, 2007 at 03:28:34PM +0530, Bhagi rathi wrote:
 Thanks Dave for the response.   Thinking futher, why is that xfs_iunpin has
 to mark the inode dirty?

Because the inode has been modified, and instead of sprinkling
mark_inode_dirty_sync() all over the code, we can do it in a single
spot that catches all inode modifications. We don't have to think about
it by doing this - inodes in transactions get marked dirty for free

 All transactions generally modify one time or other, xfs_ichgtime takes care
 of marking inode as
 dirty.

Sure, but there's plenty of other transactions that don't have such
a convenient hook.

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Correct behavior on O_DIRECT sparse file writes

2007-10-15 Thread Chuck Lever

Florian Weimer wrote:

* Andrew Morton:


I don't think it's a bug.  Sure, O_DIRECT is synchronous, but that's
because it is, err, direct.  Not because it provides extra data-integrity
guarantees.  If you want those guarantees, use O_SYNC as well.


This needs to be prominently documented.  Right now, it's far from clear
that you need both O_DIRECT and O_SYNC.


It's certainly not a requirement for NFS.  O_DIRECT on NFS forces data 
to the server, which always updates a file's metadata on each write, 
including indirect blocks.
begin:vcard
fn:Chuck Lever
n:Lever;Chuck
org:Oracle Corporation;Corporate Architecture: Linux Projects Group
adr:;;1015 Granger Avenue;Ann Arbor;MI;48104;USA
title:Principal Member of Staff
tel;work:+1 248 614 5091
x-mozilla-html:FALSE
version:2.1
end:vcard



Re: Correct behavior on O_DIRECT sparse file writes

2007-10-15 Thread Bryan Henderson
 The test below creates a sparse file and then fills a hole with
 O_DIRECT.  As far as I can tell from reading generic_osync_inode, the
 filesystem metadata is only forced to disk if i_size changes during the
 file write.  I've tested ext3, xfs and reiserfs and they all skip the
 commit when filling holes.
 
 I would argue that filling holes via O_DIRECT is supposed to commit the
 metadata required to find those file blocks later.  At least on ext3,
 O_SYNC does force a commit on fill holes  (haven't tested others).
 

I don't think it's a bug.  Sure, O_DIRECT is synchronous, but that's
because it is, err, direct.  Not because it provides extra data-integrity
guarantees.  If you want those guarantees, use O_SYNC as well.

That makes sense, but how do you explain the committing of the size change 
without O_SYNC?  That seems wrong to me.

This does need to be documented carefully, because a person could easily 
believe, even subconsciously,  that O_DIRECT makes the entire file write 
direct, and sloppy documentation might actually use words to that effect.

--
Bryan Henderson IBM Almaden Research Center
San Jose CA Filesystems

-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.23-mm1: BUG in reiserfs_delete_xattrs

2007-10-15 Thread Jeff Mahoney
Christoph Hellwig wrote:
 On Mon, Oct 15, 2007 at 12:34:58AM +0200, Laurent Riffard wrote:
 reiserfs_delete_xattrs
 reiserfs_delete_inode
 generic_delete_inode
 generic_drop_inode
 iput
 do_unlinkat
 sys_unlink
 sys_enter_past_esp

 I reported a similar BUG in 2.6.22-rc8-mm2 (see
 http://lkml.org/lkml/2007/9/27/235). Dave Hansen sent a patch for it, I
 tested it and it was OK for 2.6.22-rc8-mm2.

 I tried this patch on 2.6.23-mm1, and it fixed the BUGs here too.
 
 The delete path is a similar case as the one Dave fixed, also cause by
 a NULL vfsmount passed to dentry_open, but through a different code-path.
 
 Untested fix for this problem below:

Here's a patch I worked up the other night that kills off struct file
completely from the xattr code. I've tested it locally.


 After several posts and bug reports regarding interaction with the NULL
 nameidata, here's a patch to clean up the mess with struct file in the
 reiserfs xattr code.

 As observed in several of the posts, there's really no need for struct file
 to exist in the xattr code. It was really only passed around due to the
 f_op-readdir() and a_ops-{prepare,commit}_write prototypes requiring it.

 reiserfs_prepare_write() and reiserfs_commit_write() don't actually use
 the struct file passed to it, and the xattr code uses a private version of
 reiserfs_readdir() to enumerate the xattr directories.

 I do have patches in my queue to convert the xattrs to use reiserfs_readdir(),
 but I guess I'll just have to rework those.

 This is pretty close to the patch by Dave Hansen for -mm, but I didn't
 notice it until after I wrote this up.

Signed-off-by: Jeff Mahoney [EMAIL PROTECTED]
---
 fs/reiserfs/xattr.c |  111 ++--
 1 file changed, 31 insertions(+), 80 deletions(-)

--- a/fs/reiserfs/xattr.c   2007-08-27 14:03:39.0 -0400
+++ b/fs/reiserfs/xattr.c   2007-10-14 22:11:05.0 -0400
@@ -191,28 +191,11 @@ static struct dentry *get_xa_file_dentry
dput(xadir);
if (err)
xafile = ERR_PTR(err);
-   return xafile;
-}
-
-/* Opens a file pointer to the attribute associated with inode */
-static struct file *open_xa_file(const struct inode *inode, const char *name,
-int flags)
-{
-   struct dentry *xafile;
-   struct file *fp;
-
-   xafile = get_xa_file_dentry(inode, name, flags);
-   if (IS_ERR(xafile))
-   return ERR_PTR(PTR_ERR(xafile));
else if (!xafile-d_inode) {
dput(xafile);
-   return ERR_PTR(-ENODATA);
+   xafile = ERR_PTR(-ENODATA);
}
-
-   fp = dentry_open(xafile, NULL, O_RDWR);
-   /* dentry_open dputs the dentry if it fails */
-
-   return fp;
+   return xafile;
 }
 
 /*
@@ -228,9 +211,8 @@ static struct file *open_xa_file(const s
  * we're called with i_mutex held, so there are no worries about the directory
  * changing underneath us.
  */
-static int __xattr_readdir(struct file *filp, void *dirent, filldir_t filldir)
+static int __xattr_readdir(struct inode *inode, void *dirent, filldir_t 
filldir)
 {
-   struct inode *inode = filp-f_path.dentry-d_inode;
struct cpu_key pos_key; /* key of current position in the directory 
(key of directory entry) */
INITIALIZE_PATH(path_to_entry);
struct buffer_head *bh;
@@ -374,23 +356,16 @@ static int __xattr_readdir(struct file *
  *
  */
 static
-int xattr_readdir(struct file *file, filldir_t filler, void *buf)
+int xattr_readdir(struct inode *inode, filldir_t filler, void *buf)
 {
-   struct inode *inode = file-f_path.dentry-d_inode;
-   int res = -ENOTDIR;
-   if (!file-f_op || !file-f_op-readdir)
-   goto out;
+   int res = -ENOENT;
mutex_lock_nested(inode-i_mutex, I_MUTEX_XATTR);
-//down(inode-i_zombie);
-   res = -ENOENT;
if (!IS_DEADDIR(inode)) {
lock_kernel();
-   res = __xattr_readdir(file, buf, filler);
+   res = __xattr_readdir(inode, buf, filler);
unlock_kernel();
}
-//up(inode-i_zombie);
mutex_unlock(inode-i_mutex);
-  out:
return res;
 }
 
@@ -436,7 +411,7 @@ reiserfs_xattr_set(struct inode *inode, 
   size_t buffer_size, int flags)
 {
int err = 0;
-   struct file *fp;
+   struct dentry *dentry;
struct page *page;
char *data;
struct address_space *mapping;
@@ -454,18 +429,18 @@ reiserfs_xattr_set(struct inode *inode, 
xahash = xattr_hash(buffer, buffer_size);
 
   open_file:
-   fp = open_xa_file(inode, name, flags);
-   if (IS_ERR(fp)) {
-   err = PTR_ERR(fp);
+   dentry = get_xa_file_dentry(inode, name, flags);
+   if (IS_ERR(dentry)) {
+   err = PTR_ERR(dentry);
goto out;
}
 
-   xinode = fp-f_path.dentry-d_inode;
+   xinode = 

Re: 2.6.23-mm1: BUG in reiserfs_delete_xattrs

2007-10-15 Thread Laurent Riffard
Le 15.10.2007 10:40, Christoph Hellwig a écrit :
 On Mon, Oct 15, 2007 at 12:34:58AM +0200, Laurent Riffard wrote:
 reiserfs_delete_xattrs
 reiserfs_delete_inode
 generic_delete_inode
 generic_drop_inode
 iput
 do_unlinkat
 sys_unlink
 sys_enter_past_esp

 I reported a similar BUG in 2.6.22-rc8-mm2 (see
 http://lkml.org/lkml/2007/9/27/235). Dave Hansen sent a patch for it, I
 tested it and it was OK for 2.6.22-rc8-mm2.

 I tried this patch on 2.6.23-mm1, and it fixed the BUGs here too.
 
 The delete path is a similar case as the one Dave fixed, also cause by
 a NULL vfsmount passed to dentry_open, but through a different code-path.
 
 Untested fix for this problem below:

Does work fine, thanks.

Tested-by: Laurent Riffard [EMAIL PROTECTED]

 
 Index: linux-2.6.23-rc8/fs/reiserfs/xattr.c
 ===
 --- linux-2.6.23-rc8.orig/fs/reiserfs/xattr.c 2007-09-30 14:13:46.0 
 +0200
 +++ linux-2.6.23-rc8/fs/reiserfs/xattr.c  2007-09-30 14:18:30.0 
 +0200
 @@ -207,9 +207,8 @@ static struct dentry *get_xa_file_dentry
   * we're called with i_mutex held, so there are no worries about the 
 directory
   * changing underneath us.
   */
 -static int __xattr_readdir(struct file *filp, void *dirent, filldir_t 
 filldir)
 +static int __xattr_readdir(struct inode *inode, void *dirent, filldir_t 
 filldir)
  {
 - struct inode *inode = filp-f_path.dentry-d_inode;
   struct cpu_key pos_key; /* key of current position in the directory 
 (key of directory entry) */
   INITIALIZE_PATH(path_to_entry);
   struct buffer_head *bh;
 @@ -352,24 +351,19 @@ static int __xattr_readdir(struct file *
   * this is stolen from vfs_readdir
   *
   */
 -static
 -int xattr_readdir(struct file *file, filldir_t filler, void *buf)
 +static int xattr_readdir(struct inode *inode, filldir_t filler, void *buf)
  {
 - struct inode *inode = file-f_path.dentry-d_inode;
   int res = -ENOTDIR;
 - if (!file-f_op || !file-f_op-readdir)
 - goto out;
 +
   mutex_lock_nested(inode-i_mutex, I_MUTEX_XATTR);
 -//down(inode-i_zombie);
   res = -ENOENT;
   if (!IS_DEADDIR(inode)) {
   lock_kernel();
 - res = __xattr_readdir(file, buf, filler);
 + res = __xattr_readdir(inode, buf, filler);
   unlock_kernel();
   }
 -//up(inode-i_zombie);
   mutex_unlock(inode-i_mutex);
 -  out:
 +
   return res;
  }
  
 @@ -721,7 +715,6 @@ reiserfs_delete_xattrs_filler(void *buf,
  /* This is called w/ inode-i_mutex downed */
  int reiserfs_delete_xattrs(struct inode *inode)
  {
 - struct file *fp;
   struct dentry *dir, *root;
   int err = 0;
  
 @@ -742,15 +735,8 @@ int reiserfs_delete_xattrs(struct inode 
   return 0;
   }
  
 - fp = dentry_open(dir, NULL, O_RDWR);
 - if (IS_ERR(fp)) {
 - err = PTR_ERR(fp);
 - /* dentry_open dputs the dentry if it fails */
 - goto out;
 - }
 -
   lock_kernel();
 - err = xattr_readdir(fp, reiserfs_delete_xattrs_filler, dir);
 + err = xattr_readdir(dir-d_inode, reiserfs_delete_xattrs_filler, dir);
   if (err) {
   unlock_kernel();
   goto out_dir;
 @@ -770,7 +756,7 @@ int reiserfs_delete_xattrs(struct inode 
   unlock_kernel();
  
out_dir:
 - fput(fp);
 + dput(dir);
  
out:
   if (!err)
 @@ -812,7 +798,6 @@ reiserfs_chown_xattrs_filler(void *buf, 
  
  int reiserfs_chown_xattrs(struct inode *inode, struct iattr *attrs)
  {
 - struct file *fp;
   struct dentry *dir;
   int err = 0;
   struct reiserfs_chown_buf buf;
 @@ -836,13 +821,6 @@ int reiserfs_chown_xattrs(struct inode *
   goto out;
   }
  
 - fp = dentry_open(dir, NULL, O_RDWR);
 - if (IS_ERR(fp)) {
 - err = PTR_ERR(fp);
 - /* dentry_open dputs the dentry if it fails */
 - goto out;
 - }
 -
   lock_kernel();
  
   attrs-ia_valid = (ATTR_UID | ATTR_GID | ATTR_CTIME);
 @@ -850,7 +828,7 @@ int reiserfs_chown_xattrs(struct inode *
   buf.attrs = attrs;
   buf.inode = inode;
  
 - err = xattr_readdir(fp, reiserfs_chown_xattrs_filler, buf);
 + err = xattr_readdir(dir-d_inode, reiserfs_chown_xattrs_filler, buf);
   if (err) {
   unlock_kernel();
   goto out_dir;
 @@ -860,7 +838,7 @@ int reiserfs_chown_xattrs(struct inode *
   unlock_kernel();
  
out_dir:
 - fput(fp);
 + dput(dir);
  
out:
   attrs-ia_valid = ia_valid;
 @@ -1008,7 +986,6 @@ reiserfs_listxattr_filler(void *buf, con
   */
  ssize_t reiserfs_listxattr(struct dentry * dentry, char *buffer, size_t size)
  {
 - struct file *fp;
   struct dentry *dir;
   int err = 0;
   struct reiserfs_listxattr_buf buf;
 @@ -1031,13 +1008,6 @@ ssize_t reiserfs_listxattr(struct dentry
   goto out;
   }
  
 - fp 

Re: 2.6.23-mm1: BUG in reiserfs_delete_xattrs

2007-10-15 Thread Laurent Riffard
Le 15.10.2007 20:31, Jeff Mahoney a écrit :
 Christoph Hellwig wrote:
 On Mon, Oct 15, 2007 at 12:34:58AM +0200, Laurent Riffard wrote:
 reiserfs_delete_xattrs
 reiserfs_delete_inode
 generic_delete_inode
 generic_drop_inode
 iput
 do_unlinkat
 sys_unlink
 sys_enter_past_esp

 I reported a similar BUG in 2.6.22-rc8-mm2 (see
 http://lkml.org/lkml/2007/9/27/235). Dave Hansen sent a patch for it, I
 tested it and it was OK for 2.6.22-rc8-mm2.

 I tried this patch on 2.6.23-mm1, and it fixed the BUGs here too.
 The delete path is a similar case as the one Dave fixed, also cause by
 a NULL vfsmount passed to dentry_open, but through a different code-path.

 Untested fix for this problem below:
 
 Here's a patch I worked up the other night that kills off struct file
 completely from the xattr code. I've tested it locally.

Sorry Jeff, your patch does not apply on 2.6.23-mm1. The 'struct file'
removal from reiserfs_xattr_ function is already in -mm
(make-reiserfs-stop-using-struct-file-for-internal.patch).

The Dave's patch I was refering to is this one: 

 BEGIN =
The bug is caused by reiserfs creating a special 'struct file' with a
NULL vfsmount.  

/* Opens a file pointer to the attribute associated with inode */
static struct file *open_xa_file(const struct inode *inode, const char
*name,
 int flags)
{
...
fp = dentry_open(xafile, NULL, O_RDWR);
/* dentry_open dputs the dentry if it fails */


As Christoph just said, this is somewhat of a bandaid.  But, it
shouldn't hurt anything.

---

 lxc-dave/fs/file_table.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff -puN fs/open.c~fix-reiserfs-oops fs/open.c
diff -puN fs/file_table.c~fix-reiserfs-oops fs/file_table.c
--- lxc/fs/file_table.c~fix-reiserfs-oops   2007-09-27 13:32:20.0 
-0700
+++ lxc-dave/fs/file_table.c2007-09-27 13:33:11.0 -0700
@@ -236,7 +236,7 @@ void fastcall __fput(struct file *file)
fops_put(file-f_op);
if (file-f_mode  FMODE_WRITE) {
put_write_access(inode);
-   if (!special_file(inode-i_mode))
+   if (!special_file(inode-i_mode)  mnt)
mnt_drop_write(mnt);
}
put_pid(file-f_owner.pid);
diff -puN include/linux/mount.h~fix-reiserfs-oops include/linux/mount.h
 END 

Dave sent it privately to me... I guess this bandaid is no longer
needed now, is it?

~~
laurent

-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.23-mm1: BUG in reiserfs_delete_xattrs

2007-10-15 Thread Jeff Mahoney
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Laurent Riffard wrote:
 Le 15.10.2007 20:31, Jeff Mahoney a écrit :
 Christoph Hellwig wrote:
 On Mon, Oct 15, 2007 at 12:34:58AM +0200, Laurent Riffard wrote:
 reiserfs_delete_xattrs
 reiserfs_delete_inode
 generic_delete_inode
 generic_drop_inode
 iput
 do_unlinkat
 sys_unlink
 sys_enter_past_esp

 I reported a similar BUG in 2.6.22-rc8-mm2 (see
 http://lkml.org/lkml/2007/9/27/235). Dave Hansen sent a patch for it, I
 tested it and it was OK for 2.6.22-rc8-mm2.

 I tried this patch on 2.6.23-mm1, and it fixed the BUGs here too.
 The delete path is a similar case as the one Dave fixed, also cause by
 a NULL vfsmount passed to dentry_open, but through a different code-path.

 Untested fix for this problem below:
 Here's a patch I worked up the other night that kills off struct file
 completely from the xattr code. I've tested it locally.
 
 Sorry Jeff, your patch does not apply on 2.6.23-mm1. The 'struct file'
 removal from reiserfs_xattr_ function is already in -mm
 (make-reiserfs-stop-using-struct-file-for-internal.patch).
 
 The Dave's patch I was refering to is this one: 

I'd guess not. This patch was actually against mainline. I should've
specified. I can work up one against -mm later today if it's needed.

- -Jeff

- --
Jeff Mahoney
SUSE Labs
-BEGIN PGP SIGNATURE-
Version: GnuPG v2.0.4-svn0 (GNU/Linux)
Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org

iD8DBQFHE8wyLPWxlyuTD7IRAiJrAJ4nC6gwH1cFjWx6BI04O5fDIRftmACcD2wb
whyXThHlIBK2phnZ6Pf8Pb8=
=Kx6k
-END PGP SIGNATURE-
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


More Large blocksize benchmarks

2007-10-15 Thread Chris Mason
Hello everyone,

I'm stealing the cc list and reviving and old thread because I've
finally got some numbers to go along with the Btrfs variable blocksize
feature.  The basic idea is to create a read/write interface to
map a range of bytes on the address space, and use it in Btrfs for all
metadata operations (file operations have always been extent based).

So, instead of casting buffer_head-b_data to some structure, I read and
write at offsets in a struct extent_buffer.  The extent buffer is very
small and backed by an address space, and I get large block sizes the
same way file_write gets to write to 16k at a time, by finding the
appropriate page in the addess space.  This is an over simplification
since I try to cache these mapping decisions to avoid using too much
CPU, but hopefully you get the idea.

The advantage to this approach is the changes are all inside Btrfs.  No
extra kernel patches were required.

Dave reported that XFS saw much higher write throughput with large
blocksizes, but so far I'm seeing the most benefits during reads.

The next step is a bunch more benchmarks.  I've done the first round
and posted it here:

http://oss.oracle.com/~mason/blocksizes/

The Btrfs code makes it relatively easy to experiment, and so this may
be a good step toward figuring out if some automagic solution is worth
it in general.  I can even use different sizes for nodes and leaves,
although I haven't done much testing at all there yet.

-chris

-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/4] docbook: fix filesystems content

2007-10-15 Thread Randy Dunlap
From: Randy Dunlap [EMAIL PROTECTED]

Fix filesystems docbook warnings.

Warning(linux-2.6.23-git8//fs/debugfs/file.c:241): No description found for 
parameter 'name'
Warning(linux-2.6.23-git8//fs/debugfs/file.c:241): No description found for 
parameter 'mode'
Warning(linux-2.6.23-git8//fs/debugfs/file.c:241): No description found for 
parameter 'parent'
Warning(linux-2.6.23-git8//fs/debugfs/file.c:241): No description found for 
parameter 'value'
Warning(linux-2.6.23-git8//include/linux/jbd.h:404): No description found for 
parameter 'h_lockdep_map'

Signed-off-by: Randy Dunlap [EMAIL PROTECTED]
---
 fs/debugfs/file.c   |   41 +++--
 include/linux/jbd.h |1 +
 2 files changed, 36 insertions(+), 6 deletions(-)

--- linux-2.6.23-git8.orig/fs/debugfs/file.c
+++ linux-2.6.23-git8/fs/debugfs/file.c
@@ -227,15 +227,24 @@ DEFINE_SIMPLE_ATTRIBUTE(fops_x16, debugf
 
 DEFINE_SIMPLE_ATTRIBUTE(fops_x32, debugfs_u32_get, debugfs_u32_set, 
0x%08llx\n);
 
-/**
- * debugfs_create_x8 - create a debugfs file that is used to read and write an 
unsigned 8-bit value
- * debugfs_create_x16 - create a debugfs file that is used to read and write 
an unsigned 16-bit value
- * debugfs_create_x32 - create a debugfs file that is used to read and write 
an unsigned 32-bit value
+/*
+ * debugfs_create_x{8,16,32} - create a debugfs file that is used to read and 
write an unsigned {8,16,32}-bit value
  *
- * These functions are exactly the same as the above functions, (but use a hex
- * output for the decimal challenged) for details look at the above unsigned
+ * These functions are exactly the same as the above functions (but use a hex
+ * output for the decimal challenged). For details look at the above unsigned
  * decimal functions.
  */
+
+/**
+ * debugfs_create_x8 - create a debugfs file that is used to read and write an 
unsigned 8-bit value
+ * @name: a pointer to a string containing the name of the file to create.
+ * @mode: the permission that the file should have
+ * @parent: a pointer to the parent dentry for this file.  This should be a
+ *  directory dentry if set.  If this parameter is %NULL, then the
+ *  file will be created in the root of the debugfs filesystem.
+ * @value: a pointer to the variable that the file should read to and write
+ * from.
+ */
 struct dentry *debugfs_create_x8(const char *name, mode_t mode,
 struct dentry *parent, u8 *value)
 {
@@ -243,6 +252,16 @@ struct dentry *debugfs_create_x8(const c
 }
 EXPORT_SYMBOL_GPL(debugfs_create_x8);
 
+/**
+ * debugfs_create_x16 - create a debugfs file that is used to read and write 
an unsigned 16-bit value
+ * @name: a pointer to a string containing the name of the file to create.
+ * @mode: the permission that the file should have
+ * @parent: a pointer to the parent dentry for this file.  This should be a
+ *  directory dentry if set.  If this parameter is %NULL, then the
+ *  file will be created in the root of the debugfs filesystem.
+ * @value: a pointer to the variable that the file should read to and write
+ * from.
+ */
 struct dentry *debugfs_create_x16(const char *name, mode_t mode,
 struct dentry *parent, u16 *value)
 {
@@ -250,6 +269,16 @@ struct dentry *debugfs_create_x16(const 
 }
 EXPORT_SYMBOL_GPL(debugfs_create_x16);
 
+/**
+ * debugfs_create_x32 - create a debugfs file that is used to read and write 
an unsigned 32-bit value
+ * @name: a pointer to a string containing the name of the file to create.
+ * @mode: the permission that the file should have
+ * @parent: a pointer to the parent dentry for this file.  This should be a
+ *  directory dentry if set.  If this parameter is %NULL, then the
+ *  file will be created in the root of the debugfs filesystem.
+ * @value: a pointer to the variable that the file should read to and write
+ * from.
+ */
 struct dentry *debugfs_create_x32(const char *name, mode_t mode,
 struct dentry *parent, u32 *value)
 {
--- linux-2.6.23-git8.orig/include/linux/jbd.h
+++ linux-2.6.23-git8/include/linux/jbd.h
@@ -372,6 +372,7 @@ struct jbd_revoke_table_s;
  * @h_sync: flag for sync-on-close
  * @h_jdata: flag to force data journaling
  * @h_aborted: flag indicating fatal error on handle
+ * @h_lockdep_map: lockdep info for debugging lock problems
  **/
 
 /* Docbook can't yet cope with the bit fields, but will leave the documentation
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: More Large blocksize benchmarks

2007-10-15 Thread Christoph Lameter
On Mon, 15 Oct 2007, Chris Mason wrote:

 Dave reported that XFS saw much higher write throughput with large
 blocksizes, but so far I'm seeing the most benefits during reads.

Dave's tests were done with an early large blocksize patchset that had 
issues with readahead. More recent versions have the fixes by Fengguang 
that address the issue.
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: More Large blocksize benchmarks

2007-10-15 Thread David Chinner
On Mon, Oct 15, 2007 at 08:22:31PM -0400, Chris Mason wrote:
 Hello everyone,
 
 I'm stealing the cc list and reviving and old thread because I've
 finally got some numbers to go along with the Btrfs variable blocksize
 feature.  The basic idea is to create a read/write interface to
 map a range of bytes on the address space, and use it in Btrfs for all
 metadata operations (file operations have always been extent based).
 
 So, instead of casting buffer_head-b_data to some structure, I read and
 write at offsets in a struct extent_buffer.  The extent buffer is very
 small and backed by an address space, and I get large block sizes the
 same way file_write gets to write to 16k at a time, by finding the
 appropriate page in the addess space.  This is an over simplification
 since I try to cache these mapping decisions to avoid using too much
 CPU, but hopefully you get the idea.
 
 The advantage to this approach is the changes are all inside Btrfs.  No
 extra kernel patches were required.
 
 Dave reported that XFS saw much higher write throughput with large
 blocksizes, but so far I'm seeing the most benefits during reads.

Apples to oranges, Chris ;)

btrfs linearises writes due to it's COW behaviour and this is trades
off read speed. i.e. we take more seeks to read data so we can keep
the write speed high. By using large blocks, you're reducing the
number of seeks needed to find anything, and hence the read speed
will increase. Write speed will be pretty much unchanged because
btrfs does linear writes no matter the block size.

XFS doesn't linearise writes and optimises it's layout for a large
number of disks and a low number of seeks on reads - the opposite
of btrfs. Hence large block sizes reduce the number of writes XFS
needs to write a given set of data+metadata and hence write speed
increases much more than the read speed (until you get to large tree
traversals).

The basic conclusion is that different filesystems will benefit in
different ways with large block sizes

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html