date:20080124

possible deadlock shown by CONFIG_PROVE_LOCKING

2008-01-24 Thread Carlos Carvalho

I compiled the kernel with Ingo's CONFIG_PROVE_LOCKING and got the
below at boot. Is it a problem?

Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
... MAX_LOCKDEP_SUBCLASSES:8
... MAX_LOCK_DEPTH:  30
... MAX_LOCKDEP_KEYS:2048
... CLASSHASH_SIZE:   1024
... MAX_LOCKDEP_ENTRIES: 8192
... MAX_LOCKDEP_CHAINS:  16384
... CHAINHASH_SIZE:  8192
 memory used by lock dependency info: 1648 kB
 per task-struct memory footprint: 1680 bytes

| Locking API testsuite:

[removed]
---
Good, all 218 testcases passed! |
-

Further down

md: running: <sdah1><sdag1>
raid1: raid set md3 active with 2 out of 2 mirrors
md: ... autorun DONE.
Filesystem md1: Disabling barriers, not supported by the underlying device
XFS mounting filesystem md1
Ending clean XFS mount for filesystem: md1
VFS: Mounted root (xfs filesystem).
Freeing unused kernel memory: 284k freed
Warning: unable to open an initial console.
Filesystem md1: Disabling barriers, not supported by the underlying device

===
[ INFO: possible circular locking dependency detected ]
2.6.22.16 #1
---
mount/1558 is trying to acquire lock:
 (&(&ip->i_lock)->mr_lock/1){--..}, at: [80312805] xfs_ilock+0x63/0x8d

but task is already holding lock:
 (&(&ip->i_lock)->mr_lock){}, at: [80312805] xfs_ilock+0x63/0x8d

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&(&ip->i_lock)->mr_lock){}:
   [80249fa6] __lock_acquire+0xa0f/0xb9f
   [8024a50d] lock_acquire+0x48/0x63
   [80312805] xfs_ilock+0x63/0x8d
   [8023c909] down_write_nested+0x38/0x46
   [80312805] xfs_ilock+0x63/0x8d
   [803132e8] xfs_iget_core+0x3ef/0x705
   [803136a2] xfs_iget+0xa4/0x14e
   [80328364] xfs_trans_iget+0xb4/0x128
   [80316a57] xfs_ialloc+0x9b/0x4b7
   [80249fc9] __lock_acquire+0xa32/0xb9f
   [80328d87] xfs_dir_ialloc+0x84/0x2cd
   [80312805] xfs_ilock+0x63/0x8d
   [8023c909] down_write_nested+0x38/0x46
   [8032e307] xfs_create+0x331/0x65f
   [80308163] xfs_dir2_leaf_lookup+0x1d/0x96
   [80338367] xfs_vn_mknod+0x12f/0x1f2
   [8027fb0a] vfs_create+0x6e/0x9e
   [80282af3] open_namei+0x1f7/0x6a9
   [8021843d] do_page_fault+0x438/0x78f
   [8027705a] do_filp_open+0x1c/0x3d
   [8045bf56] _spin_unlock+0x17/0x20
   [80276e3d] get_unused_fd+0x11c/0x12a
   [802770bb] do_sys_open+0x40/0x7b
   [802095be] system_call+0x7e/0x83
   [] 0x

-> #0 (&(&ip->i_lock)->mr_lock/1){--..}:
   [80248896] print_circular_bug_header+0xcc/0xd3
   [80249ea2] __lock_acquire+0x90b/0xb9f
   [8024a50d] lock_acquire+0x48/0x63
   [80312805] xfs_ilock+0x63/0x8d
   [8023c909] down_write_nested+0x38/0x46
   [80312805] xfs_ilock+0x63/0x8d
   [8032bd30] xfs_lock_inodes+0x152/0x16d
   [8032e807] xfs_link+0x1d2/0x3f7
   [80249f3f] __lock_acquire+0x9a8/0xb9f
   [80337fe5] xfs_vn_link+0x3c/0x91
   [80248f4a] mark_held_locks+0x58/0x72
   [8045a9b7] __mutex_lock_slowpath+0x250/0x266
   [80249119] trace_hardirqs_on+0x115/0x139
   [8045a9c2] __mutex_lock_slowpath+0x25b/0x266
   [8027f88b] vfs_link+0xe8/0x124
   [802822d8] sys_linkat+0xcd/0x129
   [8045baaf] trace_hardirqs_on_thunk+0x35/0x37
   [80249119] trace_hardirqs_on+0x115/0x139
   [8045baaf] trace_hardirqs_on_thunk+0x35/0x37
   [802095be] system_call+0x7e/0x83
   [] 0x

other info that might help us debug this:

3 locks held by mount/1558:
 #0:  (&inode->i_mutex/1){--..}, at: [802800f5] lookup_create+0x23/0x85
 #1:  (&inode->i_mutex){--..}, at: [8027f878] vfs_link+0xd5/0x124
 #2:  (&(&ip->i_lock)->mr_lock){}, at: [80312805] xfs_ilock+0x63/0x8d

stack backtrace:

Call Trace:
 [80248612] print_circular_bug_tail+0x69/0x72
 [80248896] print_circular_bug_header+0xcc/0xd3
 [80249ea2] __lock_acquire+0x90b/0xb9f
 [8024a50d] lock_acquire+0x48/0x63
 [80312805] xfs_ilock+0x63/0x8d
 [8023c909] down_write_nested+0x38/0x46
 [80312805] xfs_ilock+0x63/0x8d
 [8032bd30] xfs_lock_inodes+0x152/0x16d
 [8032e807] xfs_link+0x1d2/0x3f7
 [80249f3f] __lock_acquire+0x9a8/0xb9f
 [80337fe5] xfs_vn_link+0x3c/0x91

Re: [RFC] Parallelize IO for e2fsck

2008-01-24 Thread Bodo Eggert

Alan Cox [EMAIL PROTECTED] wrote:

> > I'd tried to advocate SIGDANGER some years ago as well, but none of
> > the kernel maintainers were interested.  It definitely makes sense
> > to have some sort of mechanism like this.  At the time I first brought
> > it up it was in conjunction with Netscape using too much cache on some
> > system, but it would be just as useful for all kinds of other memory-
> > hungry applications.
>
> There is an early thread for a /proc file which you can add to your
> poll() set and it will wake people when memory is low. Very elegant and
> if async support is added it will also give you the signal variant for
> free.

IMO you'll need a userspace daemon. The kernel only knows about the
amount of memory available / recommended for a system (or container),
while the user knows which program's cache is most precious today.

(Of course the userspace daemon will in turn need the /proc file.)

I think a single, system-wide signal is the second-worst solution: All
applications (or the wrong one, if you select one) would free their caches
and start to crawl, and either stay in this state or slowly increase their
caches again until they get signaled again. And the signal would either
come too early or too late. The userspace daemon could collect the weighted
demand of memory from all applications and tell them how much to use.
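
As a rough illustration of the weighted-demand idea above, here is a minimal
sketch in C of how such a daemon might divide a memory budget. Everything in
it (struct mem_client, distribute_budget(), the weight field) is hypothetical
and not taken from any existing daemon or kernel interface. Each grant is
scaled in proportion to weight * demand and capped at what the application
asked for; budget left over from capped clients is not redistributed.

```c
#include <stddef.h>

/* Hypothetical per-application record kept by the daemon. */
struct mem_client {
    unsigned long demand_kb;  /* cache size the application asked for */
    unsigned int  weight;     /* user-assigned priority, higher = more precious */
    unsigned long grant_kb;   /* filled in by distribute_budget() */
};

/*
 * Split budget_kb among n clients in proportion to weight * demand,
 * never granting a client more than it asked for.
 */
void distribute_budget(struct mem_client *c, size_t n, unsigned long budget_kb)
{
    unsigned long long total = 0;
    size_t i;

    for (i = 0; i < n; i++)
        total += (unsigned long long)c[i].weight * c[i].demand_kb;

    for (i = 0; i < n; i++) {
        unsigned long long share = 0;

        if (total != 0)
            share = (unsigned long long)budget_kb * c[i].weight *
                    c[i].demand_kb / total;
        if (share > c[i].demand_kb)
            share = c[i].demand_kb;   /* cap at the stated demand */
        c[i].grant_kb = (unsigned long)share;
    }
}
```

A daemon built this way would rerun distribute_budget() whenever the
low-memory notification fires or a client changes its demand, and then tell
each client its new grant_kb.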

-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[patch 00/26] mount options: fix filesystem's -show_options

2008-01-24 Thread Miklos Szeredi

Andrew,

Would you please consider these patches for -mm?  They should be
relatively uncontroversial and straightforward fixes.

They touch a lot of filesystems though, so not sure about the
logistics...

For the description, see first patch's header.

Thanks,
Miklos


[patch 03/26] mount options: fix adfs

2008-01-24 Thread Miklos Szeredi

From: Miklos Szeredi [EMAIL PROTECTED]

Add a .show_options super operation to adfs.

Signed-off-by: Miklos Szeredi [EMAIL PROTECTED]
---

Index: linux/fs/adfs/super.c
===
--- linux.orig/fs/adfs/super.c  2008-01-24 13:48:43.0 +0100
+++ linux/fs/adfs/super.c   2008-01-24 15:55:26.0 +0100
@@ -20,6 +20,8 @@
 #include <linux/vfs.h>
 #include <linux/parser.h>
 #include <linux/bitops.h>
+#include <linux/mount.h>
+#include <linux/seq_file.h>
 
 #include <asm/uaccess.h>
 #include <asm/system.h>
@@ -30,6 +32,9 @@
 #include "dir_f.h"
 #include "dir_fplus.h"
 
+#define ADFS_DEFAULT_OWNER_MASK S_IRWXU
+#define ADFS_DEFAULT_OTHER_MASK (S_IRWXG | S_IRWXO)
+
 void __adfs_error(struct super_block *sb, const char *function, const char *fmt, ...)
 {
char error_buf[128];
@@ -134,6 +139,22 @@ static void adfs_put_super(struct super_
sb->s_fs_info = NULL;
 }
 
+static int adfs_show_options(struct seq_file *seq, struct vfsmount *mnt)
+{
+   struct adfs_sb_info *asb = ADFS_SB(mnt->mnt_sb);
+
+   if (asb->s_uid != 0)
+   seq_printf(seq, ",uid=%u", asb->s_uid);
+   if (asb->s_gid != 0)
+   seq_printf(seq, ",gid=%u", asb->s_gid);
+   if (asb->s_owner_mask != ADFS_DEFAULT_OWNER_MASK)
+   seq_printf(seq, ",ownmask=%o", asb->s_owner_mask);
+   if (asb->s_other_mask != ADFS_DEFAULT_OTHER_MASK)
+   seq_printf(seq, ",othmask=%o", asb->s_other_mask);
+
+   return 0;
+}
+
 enum {Opt_uid, Opt_gid, Opt_ownmask, Opt_othmask, Opt_err};
 
 static match_table_t tokens = {
@@ -259,6 +280,7 @@ static const struct super_operations adf
.put_super  = adfs_put_super,
.statfs = adfs_statfs,
.remount_fs = adfs_remount,
+   .show_options   = adfs_show_options,
 };
 
 static struct adfs_discmap *adfs_read_map(struct super_block *sb, struct adfs_discrecord *dr)
@@ -344,8 +366,8 @@ static int adfs_fill_super(struct super_
/* set default options */
asb->s_uid = 0;
asb->s_gid = 0;
-   asb->s_owner_mask = S_IRWXU;
-   asb->s_other_mask = S_IRWXG | S_IRWXO;
+   asb->s_owner_mask = ADFS_DEFAULT_OWNER_MASK;
+   asb->s_other_mask = ADFS_DEFAULT_OTHER_MASK;
 
if (parse_options(sb, data))
goto error;
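
The pattern used by adfs_show_options() in this patch (print an option only
when it differs from its default) can be tried out in userspace. This is a
minimal sketch, assuming that snprintf() into a caller-supplied buffer stands
in for seq_printf(); struct adfs_opts, append_opt() and show_options() are
illustrative names and not kernel code.

```c
#include <stdio.h>
#include <sys/stat.h>

#define DEFAULT_OWNER_MASK S_IRWXU
#define DEFAULT_OTHER_MASK (S_IRWXG | S_IRWXO)

/* Illustrative stand-in for the option fields kept in struct adfs_sb_info. */
struct adfs_opts {
    unsigned int uid;
    unsigned int gid;
    unsigned int owner_mask;
    unsigned int other_mask;
};

/* Append one formatted option at *off, never writing past len bytes. */
static void append_opt(char *buf, size_t len, size_t *off,
                       const char *fmt, unsigned int value)
{
    int n;

    if (*off >= len)
        return;
    n = snprintf(buf + *off, len - *off, fmt, value);
    if (n > 0)
        *off += (size_t)n;
}

/* Write only the options that differ from their defaults into buf. */
void show_options(char *buf, size_t len, const struct adfs_opts *o)
{
    size_t off = 0;

    if (len == 0)
        return;
    buf[0] = '\0';
    if (o->uid != 0)
        append_opt(buf, len, &off, ",uid=%u", o->uid);
    if (o->gid != 0)
        append_opt(buf, len, &off, ",gid=%u", o->gid);
    if (o->owner_mask != DEFAULT_OWNER_MASK)
        append_opt(buf, len, &off, ",ownmask=%o", o->owner_mask);
    if (o->other_mask != DEFAULT_OTHER_MASK)
        append_opt(buf, len, &off, ",othmask=%o", o->other_mask);
}
```

Keeping the defaults in named macros, as the patch does, means the parser and
the display code cannot drift apart on what counts as "default".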


[patch 04/26] mount options: fix affs

2008-01-24 Thread Miklos Szeredi

From: Miklos Szeredi [EMAIL PROTECTED]

Add a .show_options super operation to affs.

Use generic_show_options() and save the complete option string in
affs_fill_super() and affs_remount().

Signed-off-by: Miklos Szeredi [EMAIL PROTECTED]
---

Index: linux/fs/affs/super.c
===
--- linux.orig/fs/affs/super.c  2008-01-24 18:57:19.0 +0100
+++ linux/fs/affs/super.c   2008-01-24 19:01:21.0 +0100
@@ -122,6 +122,7 @@ static const struct super_operations aff
.write_super= affs_write_super,
.statfs = affs_statfs,
.remount_fs = affs_remount,
+   .show_options   = generic_show_options,
 };
 
 enum {
@@ -272,6 +273,8 @@ static int affs_fill_super(struct super_
u8   sig[4];
int  ret = -EINVAL;
 
+   save_mount_options(sb, data);
+
pr_debug("AFFS: read_super(%s)\n",data ? (const char *)data : "no options");
 
sb->s_magic = AFFS_SUPER_MAGIC;
@@ -487,14 +490,21 @@ affs_remount(struct super_block *sb, int
int  root_block;
unsigned long mount_flags;
int  res = 0;
+   char *new_opts = kstrdup(data, GFP_KERNEL);
 
pr_debug("AFFS: remount(flags=0x%x,opts=\"%s\")\n",*flags,data);
 
*flags |= MS_NODIRATIME;
 
-   if (!parse_options(data,&uid,&gid,&mode,&reserved,&root_block,
-   &blocksize,&sbi->s_prefix,sbi->s_volume,&mount_flags))
+   if (!parse_options(data, &uid, &gid, &mode, &reserved, &root_block,
+  &blocksize, &sbi->s_prefix, sbi->s_volume,
+  &mount_flags)) {
+   kfree(new_opts);
return -EINVAL;
+   }
+   kfree(sb->s_options);
+   sb->s_options = new_opts;
+
sbi->s_flags = mount_flags;
sbi->s_mode  = mode;
sbi->s_uid   = uid;
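
The remount handling in this patch copies the option string first, parses,
and replaces the saved copy only on success, so a failed remount leaves the
previously displayed options untouched. Below is a minimal userspace sketch
of that ordering, assuming malloc()/free() in place of kstrdup()/kfree().
struct fake_sb, parse_ok() and fake_remount() are hypothetical names, and the
"parser" merely rejects strings containing "bad". Kernel option parsers often
modify the string in place, which is why the copy is taken before parsing.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the one super_block field of interest. */
struct fake_sb {
    char *s_options;
};

/* Portable replacement for kstrdup(); returns NULL for NULL input. */
static char *dup_string(const char *s)
{
    size_t n;
    char *copy;

    if (s == NULL)
        return NULL;
    n = strlen(s) + 1;
    copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* Toy parser: accepts everything except strings containing "bad". */
static int parse_ok(const char *data)
{
    return data == NULL || strstr(data, "bad") == NULL;
}

/* Swap in the new option string only if parsing succeeds. */
int fake_remount(struct fake_sb *sb, const char *data)
{
    char *new_opts = dup_string(data);   /* copy before parsing */

    if (!parse_ok(data)) {
        free(new_opts);                  /* keep the old options */
        return -EINVAL;
    }
    free(sb->s_options);
    sb->s_options = new_opts;
    return 0;
}
```

After a rejected call the saved string still holds the last accepted options,
which is what /proc/mounts would go on to report.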


[patch 05/26] mount options: fix afs

2008-01-24 Thread Miklos Szeredi

From: Miklos Szeredi [EMAIL PROTECTED]

Add a .show_options super operation to afs.

Use generic_show_options() and save the complete option string in
afs_get_sb().

Signed-off-by: Miklos Szeredi [EMAIL PROTECTED]
---

Index: linux/fs/afs/super.c
===
--- linux.orig/fs/afs/super.c   2008-01-24 11:42:44.0 +0100
+++ linux/fs/afs/super.c2008-01-24 12:05:50.0 +0100
@@ -52,6 +52,7 @@ static const struct super_operations afs
.clear_inode= afs_clear_inode,
.umount_begin   = afs_umount_begin,
.put_super  = afs_put_super,
+   .show_options   = generic_show_options,
 };
 
 static struct kmem_cache *afs_inode_cachep;
@@ -357,6 +358,7 @@ static int afs_get_sb(struct file_system
struct super_block *sb;
struct afs_volume *vol;
struct key *key;
+   char *new_opts = kstrdup(options, GFP_KERNEL);
int ret;
 
_enter(",,%s,%p", dev_name, options);
@@ -408,9 +410,11 @@ static int afs_get_sb(struct file_system
deactivate_super(sb);
goto error;
}
+   sb->s_options = new_opts;
sb->s_flags |= MS_ACTIVE;
} else {
_debug("reuse");
+   kfree(new_opts);
ASSERTCMP(sb->s_flags, &, MS_ACTIVE);
}
 
@@ -424,6 +428,7 @@ error:
afs_put_volume(params.volume);
afs_put_cell(params.cell);
key_put(params.key);
+   kfree(new_opts);
_leave(" = %d", ret);
return ret;
 }


[patch 06/26] mount options: fix autofs4

2008-01-24 Thread Miklos Szeredi

From: Miklos Szeredi [EMAIL PROTECTED]

Add uid= and gid= options to /proc/mounts for autofs4 filesystems.

Signed-off-by: Miklos Szeredi [EMAIL PROTECTED]
---

Index: linux/fs/autofs4/inode.c
===
--- linux.orig/fs/autofs4/inode.c   2008-01-22 15:52:42.0 +0100
+++ linux/fs/autofs4/inode.c2008-01-22 23:36:02.0 +0100
@@ -188,11 +188,16 @@ out_kill_sb:
 static int autofs4_show_options(struct seq_file *m, struct vfsmount *mnt)
 {
struct autofs_sb_info *sbi = autofs4_sbi(mnt->mnt_sb);
+   struct inode *root_inode = mnt->mnt_sb->s_root->d_inode;
 
if (!sbi)
return 0;
 
seq_printf(m, ",fd=%d", sbi->pipefd);
+   if (root_inode->i_uid != 0)
+   seq_printf(m, ",uid=%u", root_inode->i_uid);
+   if (root_inode->i_gid != 0)
+   seq_printf(m, ",gid=%u", root_inode->i_gid);
seq_printf(m, ",pgrp=%d", sbi->oz_pgrp);
seq_printf(m, ",timeout=%lu", sbi->exp_timeout/HZ);
seq_printf(m, ",minproto=%d", sbi->min_proto);


[patch 02/26] mount options: add generic_show_options()

2008-01-24 Thread Miklos Szeredi

From: Miklos Szeredi [EMAIL PROTECTED]

Add a new s_options field to struct super_block.  Filesystems can save
mount options passed to them in mount or remount.  It is automatically
freed when the superblock is destroyed.

A new helper function, generic_show_options() is introduced, which uses
this field to display the mount options in /proc/mounts.

Another helper function, save_mount_options() may be used by
filesystems to save the options in the super block.

Signed-off-by: Miklos Szeredi [EMAIL PROTECTED]
---

Index: linux/fs/namespace.c
===
--- linux.orig/fs/namespace.c   2008-01-24 17:07:46.0 +0100
+++ linux/fs/namespace.c2008-01-24 17:34:50.0 +0100
@@ -575,6 +575,50 @@ void mnt_unpin(struct vfsmount *mnt)
 
 EXPORT_SYMBOL(mnt_unpin);
 
+static inline void mangle(struct seq_file *m, const char *s)
+{
+   seq_escape(m, s, " \t\n\\");
+}
+
+/*
+ * Simple .show_options callback for filesystems which don't want to
+ * implement more complex mount option showing.
+ *
+ * See also save_mount_options().
+ */
+int generic_show_options(struct seq_file *m, struct vfsmount *mnt)
+{
+   const char *options = mnt->mnt_sb->s_options;
+
+   if (options != NULL && options[0]) {
+   seq_putc(m, ',');
+   mangle(m, options);
+   }
+
+   return 0;
+}
+EXPORT_SYMBOL(generic_show_options);
+
+/*
+ * If filesystem uses generic_show_options(), this function should be
+ * called from the fill_super() callback.
+ *
+ * The .remount_fs callback usually needs to be handled in a special
+ * way, to make sure, that previous options are not overwritten if the
+ * remount fails.
+ *
+ * Also note, that if the filesystem's .remount_fs function doesn't
+ * reset all options to their default value, but changes only newly
+ * given options, then the displayed options will not reflect reality
+ * any more.
+ */
+void save_mount_options(struct super_block *sb, char *options)
+{
+   kfree(sb->s_options);
+   sb->s_options = kstrdup(options, GFP_KERNEL);
+}
+EXPORT_SYMBOL(save_mount_options);
+
 /* iterator */
 static void *m_start(struct seq_file *m, loff_t *pos)
 {
@@ -596,11 +640,6 @@ static void m_stop(struct seq_file *m, v
up_read(&namespace_sem);
 }
 
-static inline void mangle(struct seq_file *m, const char *s)
-{
-   seq_escape(m, s, " \t\n\\");
-}
-
 static int show_vfsmnt(struct seq_file *m, void *v)
 {
struct vfsmount *mnt = list_entry(v, struct vfsmount, mnt_list);
Index: linux/fs/super.c
===
--- linux.orig/fs/super.c   2008-01-24 17:07:46.0 +0100
+++ linux/fs/super.c2008-01-24 17:12:33.0 +0100
@@ -105,6 +105,7 @@ static inline void destroy_super(struct 
 {
security_sb_free(s);
kfree(s->s_subtype);
+   kfree(s->s_options);
kfree(s);
 }
 
Index: linux/include/linux/fs.h
===
--- linux.orig/include/linux/fs.h   2008-01-24 17:07:46.0 +0100
+++ linux/include/linux/fs.h2008-01-24 17:12:33.0 +0100
@@ -1042,6 +1042,12 @@ struct super_block {
 * in /proc/mounts will be type.subtype
 */
char *s_subtype;
+
+   /*
+* Saved mount options for lazy filesystems using
+* generic_show_options()
+*/
+   char *s_options;
 };
 
 extern struct timespec current_fs_time(struct super_block *sb);
@@ -1992,6 +1998,9 @@ extern int __must_check inode_setattr(st
 
 extern void file_update_time(struct file *file);
 
+extern int generic_show_options(struct seq_file *m, struct vfsmount *mnt);
+extern void save_mount_options(struct super_block *sb, char *options);
+
 static inline ino_t parent_ino(struct dentry *dentry)
 {
ino_t res;
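
What generic_show_options() prints can be modelled in userspace: nothing for
a missing or empty option string, otherwise a comma followed by the saved
string with space, tab, newline and backslash written as three-digit octal
escapes, which is how seq_escape() encodes them. The sketch below is only an
approximation under those assumptions; show_saved_options() is a hypothetical
name, and a plain character buffer replaces the seq_file.

```c
#include <stdio.h>
#include <string.h>

/*
 * Userspace model of generic_show_options(): emit "," plus the saved
 * options, escaping space, tab, newline and backslash as \ooo.
 * Returns the number of characters written (excluding the NUL).
 */
size_t show_saved_options(char *dst, size_t len, const char *options)
{
    static const char esc[] = " \t\n\\";
    size_t off = 0;

    if (len == 0)
        return 0;
    if (options != NULL && options[0] != '\0') {
        if (off + 1 < len)
            dst[off++] = ',';
        for (; *options != '\0'; options++) {
            if (strchr(esc, *options) != NULL) {
                if (off + 4 >= len)   /* need 4 chars plus the NUL */
                    break;
                off += (size_t)snprintf(dst + off, len - off, "\\%03o",
                                        (unsigned int)(unsigned char)*options);
            } else {
                if (off + 1 >= len)   /* need 1 char plus the NUL */
                    break;
                dst[off++] = *options;
            }
        }
    }
    dst[off] = '\0';
    return off;
}
```

For example, the saved string "uid=1,name=my disk" would be shown as
",uid=1,name=my\040disk", so a space inside an option value cannot be
mistaken for a field separator in /proc/mounts.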


[patch 09/26] mount options: fix capifs

2008-01-24 Thread Miklos Szeredi

From: Miklos Szeredi [EMAIL PROTECTED]

Add a .show_options super operation to capifs.

Use generic_show_options() and save the complete option string in
capifs_remount().

Signed-off-by: Miklos Szeredi [EMAIL PROTECTED]
---

Index: linux/drivers/isdn/capi/capifs.c
===
--- linux.orig/drivers/isdn/capi/capifs.c   2007-10-09 22:31:38.0 +0200
+++ linux/drivers/isdn/capi/capifs.c2008-01-24 11:37:42.0 +0100
@@ -52,6 +52,7 @@ static int capifs_remount(struct super_b
gid_t gid = 0;
umode_t mode = 0600;
char *this_char;
+   char *new_opt = kstrdup(data, GFP_KERNEL);
 
this_char = NULL;
while ((this_char = strsep(&data, ",")) != NULL) {
@@ -72,11 +73,16 @@ static int capifs_remount(struct super_b
return -EINVAL;
}
}
+
+   kfree(s->s_options);
+   s->s_options = new_opt;
+
config.setuid  = setuid;
config.setgid  = setgid;
config.uid = uid;
config.gid = gid;
config.mode= mode;
+
return 0;
 }
 
@@ -84,6 +90,7 @@ static struct super_operations capifs_so
 {
.statfs = simple_statfs,
.remount_fs = capifs_remount,
+   .show_options   = generic_show_options,
 };
 
 


[patch 08/26] mount options: fix befs

2008-01-24 Thread Miklos Szeredi

From: Miklos Szeredi [EMAIL PROTECTED]

Add a .show_options super operation to befs.

Use generic_show_options() and save the complete option string in
befs_fill_super().

Signed-off-by: Miklos Szeredi [EMAIL PROTECTED]
---

Index: linux/fs/befs/linuxvfs.c
===
--- linux.orig/fs/befs/linuxvfs.c   2008-01-17 19:00:54.0 +0100
+++ linux/fs/befs/linuxvfs.c2008-01-22 21:40:05.0 +0100
@@ -57,6 +57,7 @@ static const struct super_operations bef
.put_super  = befs_put_super,   /* uninit super */
.statfs = befs_statfs,  /* statfs */
.remount_fs = befs_remount,
+   .show_options   = generic_show_options,
 };
 
 /* slab cache for befs_inode_info objects */
@@ -759,10 +760,11 @@ befs_fill_super(struct super_block *sb, 
befs_super_block *disk_sb;
struct inode *root;
long ret = -EINVAL;
-
const unsigned long sb_block = 0;
const off_t x86_sb_off = 512;
 
+   save_mount_options(sb, data);
+
sb->s_fs_info = kmalloc(sizeof (*befs_sb), GFP_KERNEL);
if (sb->s_fs_info == NULL) {
printk(KERN_ERR


[patch 10/26] mount options: fix devpts

2008-01-24 Thread Miklos Szeredi

From: Miklos Szeredi [EMAIL PROTECTED]

Add a .show_options super operation to devpts.

Also add a minor fix: when parsing the mode option, mask with
S_IALLUGO instead of ~S_IFMT, which could leave unused bits in the
mask.

Signed-off-by: Miklos Szeredi [EMAIL PROTECTED]
---

Index: linux/fs/devpts/inode.c
===
--- linux.orig/fs/devpts/inode.c2008-01-22 23:43:12.0 +0100
+++ linux/fs/devpts/inode.c 2008-01-23 13:01:05.0 +0100
@@ -20,9 +20,12 @@
 #include <linux/devpts_fs.h>
 #include <linux/parser.h>
 #include <linux/fsnotify.h>
+#include <linux/seq_file.h>
 
 #define DEVPTS_SUPER_MAGIC 0x1cd1
 
+#define DEVPTS_DEFAULT_MODE 0600
+
 static struct vfsmount *devpts_mnt;
 static struct dentry *devpts_root;
 
@@ -32,7 +35,7 @@ static struct {
uid_t   uid;
gid_t   gid;
umode_t mode;
-} config = {.mode = 0600};
+} config = {.mode = DEVPTS_DEFAULT_MODE};
 
 enum {
Opt_uid, Opt_gid, Opt_mode,
@@ -54,7 +57,7 @@ static int devpts_remount(struct super_b
config.setgid  = 0;
config.uid = 0;
config.gid = 0;
-   config.mode= 0600;
+   config.mode= DEVPTS_DEFAULT_MODE;
 
while ((p = strsep(&data, ",")) != NULL) {
substring_t args[MAX_OPT_ARGS];
@@ -81,7 +84,7 @@ static int devpts_remount(struct super_b
case Opt_mode:
if (match_octal(&args[0], &option))
return -EINVAL;
-   config.mode = option & ~S_IFMT;
+   config.mode = option & S_IALLUGO;
break;
default:
printk(KERN_ERR "devpts: called with bogus options\n");
@@ -92,9 +95,22 @@ static int devpts_remount(struct super_b
return 0;
 }
 
+static int devpts_show_options(struct seq_file *seq, struct vfsmount *vfs)
+{
+   if (config.setuid)
+   seq_printf(seq, ",uid=%u", config.uid);
+   if (config.setgid)
+   seq_printf(seq, ",gid=%u", config.gid);
+   if (config.mode != DEVPTS_DEFAULT_MODE)
+   seq_printf(seq, ",mode=%03o", config.mode);
+
+   return 0;
+}
+
 static const struct super_operations devpts_sops = {
.statfs = simple_statfs,
.remount_fs = devpts_remount,
+   .show_options   = devpts_show_options,
 };
 
 static int


[patch 11/26] mount options: fix ext2

2008-01-24 Thread Miklos Szeredi

From: Miklos Szeredi [EMAIL PROTECTED]

Add noreservation option to /proc/mounts for ext2 filesystems.

Signed-off-by: Miklos Szeredi [EMAIL PROTECTED]
---

Index: linux/fs/ext2/super.c
===
--- linux.orig/fs/ext2/super.c  2008-01-17 19:00:55.0 +0100
+++ linux/fs/ext2/super.c   2008-01-23 21:38:08.0 +0100
@@ -285,6 +285,9 @@ static int ext2_show_options(struct seq_
seq_puts(seq, ",xip");
 #endif
 
+   if (!test_opt(sb, RESERVATION))
+   seq_puts(seq, ",noreservation");
+
return 0;
 }
 


[patch 12/26] mount options: fix ext4

2008-01-24 Thread Miklos Szeredi

From: Miklos Szeredi [EMAIL PROTECTED]

Add stripe= option to /proc/mounts for ext4 filesystems.

Signed-off-by: Miklos Szeredi [EMAIL PROTECTED]
---

Index: linux/fs/ext4/super.c
===
--- linux.orig/fs/ext4/super.c  2008-01-23 12:57:07.0 +0100
+++ linux/fs/ext4/super.c   2008-01-23 21:43:51.0 +0100
@@ -742,7 +742,8 @@ static int ext4_show_options(struct seq_
seq_puts(seq, ",nomballoc");
if (!test_opt(sb, DELALLOC))
seq_puts(seq, ",nodelalloc");
-
+   if (sbi->s_stripe)
+   seq_printf(seq, ",stripe=%lu", sbi->s_stripe);
 
/*
 * journal mode get enabled in different ways


[patch 01/26] mount options: add documentation

2008-01-24 Thread Miklos Szeredi

From: Miklos Szeredi [EMAIL PROTECTED]

This series addresses the problem of showing mount options in
/proc/mounts.

Several filesystems that use mount options have not implemented a
.show_options superblock operation.  Several others have implemented
this callback, but have not kept it fully up to date with the parsed
options.

Q: Why do we need correct option showing in /proc/mounts?
A: We want /proc/mounts to fully replace /etc/mtab.  The reasons for
   this are:
- unprivileged mounters won't be able to update /etc/mtab
- /etc/mtab doesn't work with private mount namespaces
- /etc/mtab can become out-of-sync with reality

Q: Can't this be done, so that filesystems need not bother with
   implementing a .show_options callback, and keeping it up to date?
A: Only in some cases.  Certain filesystems allow modification of a
   subset of options in their remount_fs method.  It is not possible
   to take this into account without knowing exactly how the
   filesystem handles options.

For the simple case (no remount or remount resets all options) the
patchset introduces two helpers:

  generic_show_options()
  save_mount_options()

These can also be used to emulate the old /etc/mtab behavior, until
proper support is added.  Even if this is not 100% correct, it's still
better than showing no options at all.

The following patches fix up most in-tree filesystems; they have been
compile tested only.  I would like to ask maintainers (CC-d on
respective patches) to please review, test and ACK these changes.

The following filesystems still need fixing: CIFS, NFS, XFS, Unionfs,
Reiser4.  For CIFS, NFS and XFS I wasn't able to understand how some
of the options are used.  The last two are not yet in mainline, so I
leave fixing those to their respective maintainers out of pure
laziness.

Table displaying status of all in-kernel filesystems:
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
legend:

  none - fs has options, but doesn't define ->show_options()
  some - fs defines ->show_options(), but only some options are shown
  most - fs defines ->show_options(), and shows most of them
  good - fs shows all options
  noopt - fs does not have options
  patch - a patch will be posted

9p              good
adfs            patch
affs            patch
afs             patch
autofs          patch
autofs4         patch
befs            patch
bfs             noopt
cifs            some
coda            noopt
configfs        noopt
cramfs          noopt
debugfs         noopt
devpts          patch
ecryptfs        good
efs             noopt
ext2            patch
ext3            good
ext4            patch
fat             patch
freevxfs        noopt
fuse            patch
fusectl         noopt
gfs2            good
gfs2meta        noopt
hfs             good
hfsplus         good
hostfs          patch
hpfs            patch
hppfs           noopt
hugetlbfs       patch
isofs           patch
jffs2           noopt
jfs             patch
minix           noopt
msdos           ->fat
ncpfs           patch
nfs             patch,most
nfsd            noopt
ntfs            good
ocfs2           good
ocfs2/dlmfs     noopt
openpromfs      noopt
proc            noopt
qnx4            noopt
ramfs           noopt
reiserfs        patch
romfs           noopt
smbfs           good
sysfs           noopt
sysv            noopt
udf             patch
ufs             good
vfat            ->fat
xfs             most

mm/shmem.c                              patch
drivers/oprofile/oprofilefs.c           noopt
drivers/infiniband/hw/ipath/ipath_fs.c  noopt
drivers/misc/ibmasm/ibmasmfs.c          noopt
drivers/usb/core (usbfs)                patch
drivers/usb/gadget (gadgetfs)           noopt
drivers/isdn/capi/capifs.c              patch
kernel/cpuset.c                         noopt
fs/binfmt_misc.c                        noopt
net/sunrpc/rpc_pipe.c                   noopt
arch/powerpc/platforms/cell/spufs       patch
arch/s390/hypfs                         good
ipc/mqueue.c                            noopt
security (securityfs)                   noopt
security/selinux/selinuxfs.c            noopt
kernel/cgroup.c                         good
security/smack/smackfs.c                noopt
in -mm:

reiser4         some
unionfs         none
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

This patch:

Document the rules for handling mount options in the .show_options
super operation.

Signed-off-by: Miklos Szeredi [EMAIL PROTECTED]
---

Index: linux/Documentation/filesystems/vfs.txt
===
--- linux.orig/Documentation/filesystems/vfs.txt    2008-01-24 11:42:48.0 +0100
+++ linux/Documentation/filesystems/vfs.txt 2008-01-24 17:12:25.0 +0100
@@ -151,7 +151,7 @@ The get_sb() method has the following ar
   const char *dev_name: the device name we are mounting.
 
   void *data: arbitrary mount options, usually comes as an ASCII
-   string
+   string (see Mount Options section)
 
   struct vfsmount *mnt: a vfs-internal representation of a mount point
 
@@ -182,7

[patch 14/26] mount options: fix fuse

2008-01-24 Thread Miklos Szeredi

From: Miklos Szeredi [EMAIL PROTECTED]

Add blksize= option to /proc/mounts for fuseblk filesystems.

Signed-off-by: Miklos Szeredi [EMAIL PROTECTED]
---

Index: linux/fs/fuse/inode.c
===
--- linux.orig/fs/fuse/inode.c  2008-01-19 11:56:34.0 +0100
+++ linux/fs/fuse/inode.c   2008-01-21 17:53:06.0 +0100
@@ -29,6 +29,8 @@ DEFINE_MUTEX(fuse_mutex);
 
 #define FUSE_SUPER_MAGIC 0x65735546
 
+#define FUSE_DEFAULT_BLKSIZE 512
+
 struct fuse_mount_data {
int fd;
unsigned rootmode;
@@ -355,7 +357,7 @@ static int parse_fuse_opt(char *opt, str
char *p;
memset(d, 0, sizeof(struct fuse_mount_data));
d->max_read = ~0;
-   d->blksize = 512;
+   d->blksize = FUSE_DEFAULT_BLKSIZE;
 
while ((p = strsep(&opt, ",")) != NULL) {
int token;
@@ -440,6 +442,9 @@ static int fuse_show_options(struct seq_
seq_puts(m, ",allow_other");
if (fc->max_read != ~0)
seq_printf(m, ",max_read=%u", fc->max_read);
+   if (mnt->mnt_sb->s_bdev &&
+   mnt->mnt_sb->s_blocksize != FUSE_DEFAULT_BLKSIZE)
+   seq_printf(m, ",blksize=%lu", mnt->mnt_sb->s_blocksize);
return 0;
 }
 


[patch 15/26] mount options: fix hostfs

2008-01-24 Thread Miklos Szeredi

From: Miklos Szeredi [EMAIL PROTECTED]

Add the host path option to /proc/mounts for UML hostfs filesystems.

The mount source (mnt_devname) should really be used for this, but that
is not easy to change now in a backward compatible way.

Signed-off-by: Miklos Szeredi [EMAIL PROTECTED]
---

Index: linux/fs/hostfs/hostfs_kern.c
===
--- linux.orig/fs/hostfs/hostfs_kern.c  2008-01-17 19:00:55.0 +0100
+++ linux/fs/hostfs/hostfs_kern.c   2008-01-21 19:19:55.0 +0100
@@ -11,6 +11,7 @@
 #include <linux/mm.h>
 #include <linux/pagemap.h>
 #include <linux/statfs.h>
+#include <linux/seq_file.h>
 #include "hostfs.h"
 #include "init.h"
 #include "kern.h"
@@ -322,12 +323,25 @@ static void hostfs_destroy_inode(struct 
kfree(HOSTFS_I(inode));
 }
 
+static int hostfs_show_options(struct seq_file *seq, struct vfsmount *vfs)
+{
+   struct inode *root = vfs->mnt_sb->s_root->d_inode;
+   const char *root_path = HOSTFS_I(root)->host_filename;
+   size_t offset = strlen(root_ino) + 1;
+
+   if (strlen(root_path) > offset)
+   seq_printf(seq, ",%s", root_path + offset);
+
+   return 0;
+}
+
 static const struct super_operations hostfs_sbops = {
.alloc_inode= hostfs_alloc_inode,
.drop_inode = generic_delete_inode,
.delete_inode   = hostfs_delete_inode,
.destroy_inode  = hostfs_destroy_inode,
.statfs = hostfs_statfs,
+   .show_options   = hostfs_show_options,
 };
 
 int hostfs_readdir(struct file *file, void *ent, filldir_t filldir)


[patch 16/26] mount options: fix hpfs

2008-01-24 Thread Miklos Szeredi

From: Miklos Szeredi [EMAIL PROTECTED]

Add a .show_options super operation to hpfs.

Use generic_show_options() and save the complete option string in
hpfs_fill_super() and hpfs_remount_fs().

Also add a small fix: hpfs_remount_fs() should return -EINVAL on
error, instead of 1, which is not an error value.

Signed-off-by: Miklos Szeredi [EMAIL PROTECTED]
---

Index: linux/fs/hpfs/super.c
===
--- linux.orig/fs/hpfs/super.c  2008-01-17 19:00:14.0 +0100
+++ linux/fs/hpfs/super.c   2008-01-23 23:36:53.0 +0100
@@ -386,6 +386,7 @@ static int hpfs_remount_fs(struct super_
int lowercase, conv, eas, chk, errs, chkdsk, timeshift;
int o;
struct hpfs_sb_info *sbi = hpfs_sb(s);
+   char *new_opts = kstrdup(data, GFP_KERNEL);

*flags |= MS_NOATIME;

@@ -398,15 +399,15 @@ static int hpfs_remount_fs(struct super_
if (!(o = parse_opts(data, &uid, &gid, &umask, &lowercase, &conv,
&eas, &chk, &errs, &chkdsk, &timeshift))) {
printk("HPFS: bad mount options.\n");
-   return 1;
+   goto out_err;
}
if (o == 2) {
hpfs_help();
-   return 1;
+   goto out_err;
}
if (timeshift != sbi->sb_timeshift) {
printk("HPFS: timeshift can't be changed using remount.\n");
-   return 1;
+   goto out_err;
}
 
unmark_dirty(s);
@@ -419,7 +420,14 @@ static int hpfs_remount_fs(struct super_
 
if (!(*flags & MS_RDONLY)) mark_dirty(s);
 
+   kfree(s->s_options);
+   s->s_options = new_opts;
+
return 0;
+
+out_err:
+   kfree(new_opts);
+   return -EINVAL;
 }
 
 /* Super operations */
@@ -432,6 +440,7 @@ static const struct super_operations hpf
.put_super  = hpfs_put_super,
.statfs = hpfs_statfs,
.remount_fs = hpfs_remount_fs,
+   .show_options   = generic_show_options,
 };
 
 static int hpfs_fill_super(struct super_block *s, void *options, int silent)
@@ -454,6 +463,8 @@ static int hpfs_fill_super(struct super_
 
int o;
 
+   save_mount_options(s, options);
+
sbi = kzalloc(sizeof(*sbi), GFP_KERNEL);
if (!sbi)
return -ENOMEM;

[patch 18/26] mount options: fix isofs

2008-01-24 Thread Miklos Szeredi

From: Miklos Szeredi [EMAIL PROTECTED]

Add a .show_options super operation to isofs.

Use generic_show_options() and save the complete option string in
isofs_fill_super().

Signed-off-by: Miklos Szeredi [EMAIL PROTECTED]
---

Index: linux/fs/isofs/inode.c
===
--- linux.orig/fs/isofs/inode.c 2008-01-17 19:00:55.0 +0100
+++ linux/fs/isofs/inode.c  2008-01-23 22:07:51.0 +0100
@@ -110,6 +110,7 @@ static const struct super_operations iso
.put_super  = isofs_put_super,
.statfs = isofs_statfs,
.remount_fs = isofs_remount,
+   .show_options   = generic_show_options,
 };
 
 
@@ -554,6 +555,8 @@ static int isofs_fill_super(struct super
int table, error = -EINVAL;
unsigned int vol_desc_start;
 
+   save_mount_options(s, data);
+
sbi = kzalloc(sizeof(*sbi), GFP_KERNEL);
if (!sbi)
return -ENOMEM;

[patch 21/26] mount options: partially fix nfs

2008-01-24 Thread Miklos Szeredi

From: Miklos Szeredi [EMAIL PROTECTED]

Add posix, bsize=, namelen= options to /proc/mounts for nfs
filesystems.

Document several other options that are still missing.

Signed-off-by: Miklos Szeredi [EMAIL PROTECTED]
---

Index: linux/fs/nfs/super.c
===
--- linux.orig/fs/nfs/super.c   2008-01-19 11:56:34.0 +0100
+++ linux/fs/nfs/super.c    2008-01-21 20:41:30.0 +0100
@@ -449,6 +449,7 @@ static void nfs_show_mount_options(struc
} nfs_info[] = {
{ NFS_MOUNT_SOFT, ",soft", ",hard" },
{ NFS_MOUNT_INTR, ",intr", ",nointr" },
+   { NFS_MOUNT_POSIX, ",posix", "" },
{ NFS_MOUNT_NOCTO, ",nocto", "" },
{ NFS_MOUNT_NOAC, ",noac", "" },
{ NFS_MOUNT_NONLM, ",nolock", "" },
@@ -459,10 +460,17 @@ static void nfs_show_mount_options(struc
};
const struct proc_nfs_info *nfs_infop;
struct nfs_client *clp = nfss->nfs_client;
+   unsigned int default_namelen =
+   clp->rpc_ops->version == 4 ? NFS4_MAXNAMLEN :
+   clp->rpc_ops->version == 3 ? NFS3_MAXNAMLEN : NFS2_MAXNAMLEN;
 
seq_printf(m, ",vers=%d", clp->rpc_ops->version);
seq_printf(m, ",rsize=%d", nfss->rsize);
seq_printf(m, ",wsize=%d", nfss->wsize);
+   if (nfss->bsize != 0)
+   seq_printf(m, ",bsize=%d", nfss->bsize);
+   if (nfss->namelen != default_namelen)
+   seq_printf(m, ",namelen=%d", nfss->namelen);
if (nfss->acregmin != 3*HZ || showdefaults)
seq_printf(m, ",acregmin=%d", nfss->acregmin/HZ);
if (nfss->acregmax != 60*HZ || showdefaults)
@@ -482,6 +490,18 @@ static void nfs_show_mount_options(struc
seq_printf(m, ",timeo=%lu", 10U * nfss->client->cl_timeout->to_initval / HZ);
seq_printf(m, ",retrans=%u", nfss->client->cl_timeout->to_retries);
seq_printf(m, ",sec=%s", nfs_pseudoflavour_to_name(nfss->client->cl_auth->au_flavor));
+
+   /*
+* Missing options:
+* port=
+* mountport=
+* mountvers=
+* mountproto=
+* addr=
+* clientaddr=
+* mounthost=
+* mountaddr=
+*/
 }
 
 /*

[patch 23/26] mount options: fix spufs

2008-01-24 Thread Miklos Szeredi

From: Miklos Szeredi [EMAIL PROTECTED]

Add a .show_options super operation to spufs.

Use generic_show_options() and save the complete option string in
spufs_fill_super().

Signed-off-by: Miklos Szeredi [EMAIL PROTECTED]
---

Index: linux/arch/powerpc/platforms/cell/spufs/inode.c
===
--- linux.orig/arch/powerpc/platforms/cell/spufs/inode.c    2008-01-17 19:00:52.0 +0100
+++ linux/arch/powerpc/platforms/cell/spufs/inode.c 2008-01-23 23:44:36.0 +0100
@@ -744,8 +744,11 @@ spufs_fill_super(struct super_block *sb,
.statfs = simple_statfs,
.delete_inode = spufs_delete_inode,
.drop_inode = generic_delete_inode,
+   .show_options = generic_show_options,
};
 
+   save_mount_options(sb, data);
+
sb->s_maxbytes = MAX_LFS_FILESIZE;
sb->s_blocksize = PAGE_CACHE_SIZE;
sb->s_blocksize_bits = PAGE_CACHE_SHIFT;

[patch 07/26] mount options: fix autofs

2008-01-24 Thread Miklos Szeredi

From: Miklos Szeredi [EMAIL PROTECTED]

Add a .show_options super operation to autofs.

Use generic_show_options() and save the complete option string in
autofs_fill_super().

Signed-off-by: Miklos Szeredi [EMAIL PROTECTED]
---

Index: linux/fs/autofs/inode.c
===
--- linux.orig/fs/autofs/inode.c    2008-01-17 19:00:54.0 +0100
+++ linux/fs/autofs/inode.c 2008-01-24 11:16:30.0 +0100
@@ -54,6 +54,7 @@ out_kill_sb:
 
 static const struct super_operations autofs_sops = {
.statfs = simple_statfs,
+   .show_options   = generic_show_options,
 };
 
 enum {Opt_err, Opt_fd, Opt_uid, Opt_gid, Opt_pgrp, Opt_minproto, Opt_maxproto};
@@ -140,6 +141,8 @@ int autofs_fill_super(struct super_block
int minproto, maxproto;
pid_t pgid;
 
+   save_mount_options(s, data);
+
sbi = kzalloc(sizeof(*sbi), GFP_KERNEL);
if (!sbi)
goto fail_unlock;

[patch 24/26] mount options: fix tmpfs

2008-01-24 Thread Miklos Szeredi

From: Miklos Szeredi [EMAIL PROTECTED]

Add .show_options super operation to tmpfs.

Signed-off-by: Miklos Szeredi [EMAIL PROTECTED]
---

Index: linux/mm/shmem.c
===
--- linux.orig/mm/shmem.c   2008-01-21 21:20:04.0 +0100
+++ linux/mm/shmem.c    2008-01-21 21:30:04.0 +0100
@@ -49,6 +49,7 @@
 #include <linux/ctype.h>
 #include <linux/migrate.h>
 #include <linux/highmem.h>
+#include <linux/seq_file.h>
 
 #include <asm/uaccess.h>
 #include <asm/div64.h>
@@ -198,7 +199,7 @@ static DEFINE_MUTEX(shmem_swaplist_mutex
 static void shmem_free_blocks(struct inode *inode, long pages)
 {
struct shmem_sb_info *sbinfo = SHMEM_SB(inode->i_sb);
-   if (sbinfo->max_blocks) {
+   if (sbinfo->config.max_blocks) {
spin_lock(&sbinfo->stat_lock);
sbinfo->free_blocks += pages;
inode->i_blocks -= pages*BLOCKS_PER_PAGE;
@@ -209,7 +210,7 @@ static void shmem_free_blocks(struct ino
 static int shmem_reserve_inode(struct super_block *sb)
 {
struct shmem_sb_info *sbinfo = SHMEM_SB(sb);
-   if (sbinfo->max_inodes) {
+   if (sbinfo->config.max_inodes) {
spin_lock(&sbinfo->stat_lock);
if (!sbinfo->free_inodes) {
spin_unlock(&sbinfo->stat_lock);
@@ -224,7 +225,7 @@ static int shmem_reserve_inode(struct su
 static void shmem_free_inode(struct super_block *sb)
 {
struct shmem_sb_info *sbinfo = SHMEM_SB(sb);
-   if (sbinfo->max_inodes) {
+   if (sbinfo->config.max_inodes) {
spin_lock(&sbinfo->stat_lock);
sbinfo->free_inodes++;
spin_unlock(&sbinfo->stat_lock);
@@ -388,7 +389,7 @@ static swp_entry_t *shmem_swp_alloc(stru
 * page (and perhaps indirect index pages) yet to allocate:
 * a waste to allocate index if we cannot allocate data.
 */
-   if (sbinfo->max_blocks) {
+   if (sbinfo->config.max_blocks) {
spin_lock(&sbinfo->stat_lock);
if (sbinfo->free_blocks <= 1) {
spin_unlock(&sbinfo->stat_lock);
@@ -1338,7 +1339,7 @@ repeat:
} else {
shmem_swp_unmap(entry);
sbinfo = SHMEM_SB(inode->i_sb);
-   if (sbinfo->max_blocks) {
+   if (sbinfo->config.max_blocks) {
spin_lock(&sbinfo->stat_lock);
if (sbinfo->free_blocks == 0 ||
shmem_acct_block(info->flags)) {
@@ -1519,8 +1520,9 @@ shmem_get_inode(struct super_block *sb, 
case S_IFREG:
inode->i_op = &shmem_inode_operations;
inode->i_fop = &shmem_file_operations;
-   mpol_shared_policy_init(&info->policy, sbinfo->policy,
-   &sbinfo->policy_nodes);
+   mpol_shared_policy_init(&info->policy,
+   sbinfo->config.policy,
+   &sbinfo->config.policy_nodes);
break;
case S_IFDIR:
inc_nlink(inode);
@@ -1720,12 +1722,12 @@ static int shmem_statfs(struct dentry *d
buf->f_bsize = PAGE_CACHE_SIZE;
buf->f_namelen = NAME_MAX;
spin_lock(&sbinfo->stat_lock);
-   if (sbinfo->max_blocks) {
-   buf->f_blocks = sbinfo->max_blocks;
+   if (sbinfo->config.max_blocks) {
+   buf->f_blocks = sbinfo->config.max_blocks;
buf->f_bavail = buf->f_bfree = sbinfo->free_blocks;
}
-   if (sbinfo->max_inodes) {
-   buf->f_files = sbinfo->max_inodes;
+   if (sbinfo->config.max_inodes) {
+   buf->f_files = sbinfo->config.max_inodes;
buf->f_ffree = sbinfo->free_inodes;
}
/* else leave those fields 0 like simple_statfs */
@@ -2077,9 +2079,8 @@ static const struct export_operations sh
.fh_to_dentry   = shmem_fh_to_dentry,
 };
 
-static int shmem_parse_options(char *options, int *mode, uid_t *uid,
-   gid_t *gid, unsigned long *blocks, unsigned long *inodes,
-   int *policy, nodemask_t *policy_nodes)
+static int shmem_parse_options(char *options, struct shmem_config *config,
+  bool remount)
 {
char *this_char, *value, *rest;
 
@@ -2122,35 +2123,43 @@ static int shmem_parse_options(char *opt
}
if (*rest)
goto bad_val;
-   *blocks = DIV_ROUND_UP(size, PAGE_CACHE_SIZE);
+   config->max_blocks =
+   DIV_ROUND_UP(size, PAGE_CACHE_SIZE);
+   config->max_blocks_changed = 1;
} else if (!strcmp(this_char,"nr_blocks")) {
-   *blocks = memparse(value,&rest);
+

[patch 25/26] mount options: fix udf

2008-01-24 Thread Miklos Szeredi

From: Miklos Szeredi [EMAIL PROTECTED]

Add a .show_options super operation to udf.

Signed-off-by: Miklos Szeredi [EMAIL PROTECTED]
---

Index: linux/fs/udf/super.c
===
--- linux.orig/fs/udf/super.c   2008-01-24 13:48:37.0 +0100
+++ linux/fs/udf/super.c    2008-01-24 15:58:21.0 +0100
@@ -53,6 +53,8 @@
 #include <linux/vfs.h>
 #include <linux/vmalloc.h>
 #include <linux/errno.h>
+#include <linux/mount.h>
+#include <linux/seq_file.h>
 #include <asm/byteorder.h>
 
 #include <linux/udf_fs.h>
@@ -71,6 +73,8 @@
 #define VDS_POS_TERMINATING_DESC   6
 #define VDS_POS_LENGTH 7
 
+#define UDF_DEFAULT_BLOCKSIZE 2048
+
 static char error_buf[1024];
 
 /* These are the meat - everything else is stuffing */
@@ -95,6 +99,7 @@ static void udf_open_lvid(struct super_b
 static void udf_close_lvid(struct super_block *);
 static unsigned int udf_count_free(struct super_block *);
 static int udf_statfs(struct dentry *, struct kstatfs *);
+static int udf_show_options(struct seq_file *, struct vfsmount *);
 
 struct logicalVolIntegrityDescImpUse *udf_sb_lvidiu(struct udf_sb_info *sbi)
 {
@@ -181,6 +186,7 @@ static const struct super_operations udf
.write_super= udf_write_super,
.statfs = udf_statfs,
.remount_fs = udf_remount_fs,
+   .show_options   = udf_show_options,
 };
 
 struct udf_options {
@@ -247,6 +253,56 @@ static int udf_sb_alloc_partition_maps(s
return 0;
 }
 
+static int udf_show_options(struct seq_file *seq, struct vfsmount *mnt)
+{
+   struct super_block *sb = mnt->mnt_sb;
+   struct udf_sb_info *sbi = UDF_SB(sb);
+
+   if (!UDF_QUERY_FLAG(sb, UDF_FLAG_STRICT))
+   seq_puts(seq, ",nostrict");
+   if (sb->s_blocksize != UDF_DEFAULT_BLOCKSIZE)
+   seq_printf(seq, ",bs=%lu", sb->s_blocksize);
+   if (UDF_QUERY_FLAG(sb, UDF_FLAG_UNHIDE))
+   seq_puts(seq, ",unhide");
+   if (UDF_QUERY_FLAG(sb, UDF_FLAG_UNDELETE))
+   seq_puts(seq, ",undelete");
+   if (!UDF_QUERY_FLAG(sb, UDF_FLAG_USE_AD_IN_ICB))
+   seq_puts(seq, ",noadinicb");
+   if (UDF_QUERY_FLAG(sb, UDF_FLAG_USE_SHORT_AD))
+   seq_puts(seq, ",shortad");
+   if (UDF_QUERY_FLAG(sb, UDF_FLAG_UID_FORGET))
+   seq_puts(seq, ",uid=forget");
+   if (UDF_QUERY_FLAG(sb, UDF_FLAG_UID_IGNORE))
+   seq_puts(seq, ",uid=ignore");
+   if (UDF_QUERY_FLAG(sb, UDF_FLAG_GID_FORGET))
+   seq_puts(seq, ",gid=forget");
+   if (UDF_QUERY_FLAG(sb, UDF_FLAG_GID_IGNORE))
+   seq_puts(seq, ",gid=ignore");
+   if (UDF_QUERY_FLAG(sb, UDF_FLAG_UID_SET))
+   seq_printf(seq, ",uid=%u", sbi->s_uid);
+   if (UDF_QUERY_FLAG(sb, UDF_FLAG_GID_SET))
+   seq_printf(seq, ",gid=%u", sbi->s_gid);
+   if (sbi->s_umask != 0)
+   seq_printf(seq, ",umask=%o", sbi->s_umask);
+   if (UDF_QUERY_FLAG(sb, UDF_FLAG_SESSION_SET))
+   seq_printf(seq, ",session=%u", sbi->s_session);
+   if (UDF_QUERY_FLAG(sb, UDF_FLAG_LASTBLOCK_SET))
+   seq_printf(seq, ",lastblock=%u", sbi->s_last_block);
+   /* is this correct? */
+   if (sbi->s_anchor[2] != 0)
+   seq_printf(seq, ",anchor=%u", sbi->s_anchor[2]);
+   /*
+* volume, partition, fileset and rootdir seem to be ignored
+* currently
+*/
+   if (UDF_QUERY_FLAG(sb, UDF_FLAG_UTF8))
+   seq_puts(seq, ",utf8");
+   if (UDF_QUERY_FLAG(sb, UDF_FLAG_NLS_MAP) && sbi->s_nls_map)
+   seq_printf(seq, ",iocharset=%s", sbi->s_nls_map->charset);
+
+   return 0;
+}
+
 /*
  * udf_parse_options
  *
@@ -339,13 +395,14 @@ static match_table_t tokens = {
{Opt_err,   NULL}
 };
 
-static int udf_parse_options(char *options, struct udf_options *uopt)
+static int udf_parse_options(char *options, struct udf_options *uopt,
+bool remount)
 {
char *p;
int option;
 
uopt->novrs = 0;
-   uopt->blocksize = 2048;
+   uopt->blocksize = UDF_DEFAULT_BLOCKSIZE;
uopt->partition = 0x;
uopt->session = 0x;
uopt->lastblock = 0;
@@ -415,11 +472,15 @@ static int udf_parse_options(char *optio
if (match_int(args, &option))
return 0;
uopt->session = option;
+   if (!remount)
+   uopt->flags |= (1 << UDF_FLAG_SESSION_SET);
break;
case Opt_lastblock:
if (match_int(args, &option))
return 0;
uopt->lastblock = option;
+   if (!remount)
+   uopt->flags |= (1 << UDF_FLAG_LASTBLOCK_SET);
break;
case Opt_anchor:
if (match_int(args, option))
@@ -497,7 +558,7 @@ static

[patch 26/26] mount options: fix usbfs

2008-01-24 Thread Miklos Szeredi

From: Miklos Szeredi [EMAIL PROTECTED]

Add a .show_options super operation to usbfs.
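
The idea of reporting only options that differ from their defaults can be sketched in userspace as follows. This is an illustration under assumptions, not the usbfs code: `demo_opts`, `demo_append()` and `demo_show_options()` are hypothetical names, a plain character buffer stands in for the kernel's `seq_file`, and the octal constants mirror the `S_IWUSR | S_IRUGO` and `S_IRUGO` defaults used in the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical userspace copies of the defaults named in the patch. */
#define DEMO_DEFAULT_DEVMODE  0644u	/* S_IWUSR | S_IRUGO */
#define DEMO_DEFAULT_LISTMODE 0444u	/* S_IRUGO */

struct demo_opts {
	unsigned int devuid;
	unsigned int devmode;
	unsigned int listmode;
};

/* Append ",name=value" to buf; silently stops once buf is full. */
static void demo_append(char *buf, size_t len, const char *name,
			unsigned int value, int octal)
{
	size_t used = strlen(buf);

	if (used + 1 < len)
		snprintf(buf + used, len - used,
			 octal ? ",%s=%o" : ",%s=%u", name, value);
}

/* Emit only the options whose value differs from the default. */
static void demo_show_options(const struct demo_opts *o, char *buf, size_t len)
{
	assert(o != NULL && buf != NULL && len > 0);

	buf[0] = '\0';
	if (o->devuid != 0)
		demo_append(buf, len, "devuid", o->devuid, 0);
	if (o->devmode != DEMO_DEFAULT_DEVMODE)
		demo_append(buf, len, "devmode", o->devmode, 1);
	if (o->listmode != DEMO_DEFAULT_LISTMODE)
		demo_append(buf, len, "listmode", o->listmode, 1);
}
```

With every field at its default the buffer stays empty; setting `devuid` to 1000 and `devmode` to 0600 yields `,devuid=1000,devmode=600`.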

Signed-off-by: Miklos Szeredi [EMAIL PROTECTED]
---

Index: linux/drivers/usb/core/inode.c
===
--- linux.orig/drivers/usb/core/inode.c 2008-01-24 13:48:37.0 +0100
+++ linux/drivers/usb/core/inode.c  2008-01-24 16:00:03.0 +0100
@@ -38,10 +38,15 @@
 #include <linux/usbdevice_fs.h>
 #include <linux/parser.h>
 #include <linux/notifier.h>
+#include <linux/seq_file.h>
 #include <asm/byteorder.h>
 #include "usb.h"
 #include "hcd.h"
 
+#define USBFS_DEFAULT_DEVMODE (S_IWUSR | S_IRUGO)
+#define USBFS_DEFAULT_BUSMODE (S_IXUGO | S_IRUGO)
+#define USBFS_DEFAULT_LISTMODE S_IRUGO
+
 static struct super_operations usbfs_ops;
 static const struct file_operations default_file_operations;
 static struct vfsmount *usbfs_mount;
@@ -57,9 +62,33 @@ static uid_t listuid;/* = 0 */
 static gid_t devgid;   /* = 0 */
 static gid_t busgid;   /* = 0 */
 static gid_t listgid;  /* = 0 */
-static umode_t devmode = S_IWUSR | S_IRUGO;
-static umode_t busmode = S_IXUGO | S_IRUGO;
-static umode_t listmode = S_IRUGO;
+static umode_t devmode = USBFS_DEFAULT_DEVMODE;
+static umode_t busmode = USBFS_DEFAULT_BUSMODE;
+static umode_t listmode = USBFS_DEFAULT_LISTMODE;
+
+static int usbfs_show_options(struct seq_file *seq, struct vfsmount *mnt)
+{
+   if (devuid != 0)
+   seq_printf(seq, ",devuid=%u", devuid);
+   if (devgid != 0)
+   seq_printf(seq, ",devgid=%u", devgid);
+   if (devmode != USBFS_DEFAULT_DEVMODE)
+   seq_printf(seq, ",devmode=%o", devmode);
+   if (busuid != 0)
+   seq_printf(seq, ",busuid=%u", busuid);
+   if (busgid != 0)
+   seq_printf(seq, ",busgid=%u", busgid);
+   if (busmode != USBFS_DEFAULT_BUSMODE)
+   seq_printf(seq, ",busmode=%o", busmode);
+   if (listuid != 0)
+   seq_printf(seq, ",listuid=%u", listuid);
+   if (listgid != 0)
+   seq_printf(seq, ",listgid=%u", listgid);
+   if (listmode != USBFS_DEFAULT_LISTMODE)
+   seq_printf(seq, ",listmode=%o", listmode);
+
+   return 0;
+}
 
 enum {
Opt_devuid, Opt_devgid, Opt_devmode,
@@ -93,9 +122,9 @@ static int parse_options(struct super_bl
devgid = 0;
busgid = 0;
listgid = 0;
-   devmode = S_IWUSR | S_IRUGO;
-   busmode = S_IXUGO | S_IRUGO;
-   listmode = S_IRUGO;
+   devmode = USBFS_DEFAULT_DEVMODE;
+   busmode = USBFS_DEFAULT_BUSMODE;
+   listmode = USBFS_DEFAULT_LISTMODE;
 
while ((p = strsep(&data, ",")) != NULL) {
substring_t args[MAX_OPT_ARGS];
@@ -418,6 +447,7 @@ static struct super_operations usbfs_ops
.statfs =   simple_statfs,
.drop_inode =   generic_delete_inode,
.remount_fs =   remount,
+   .show_options = usbfs_show_options,
 };
 
 static int usbfs_fill_super(struct super_block *sb, void *data, int silent)

[patch 22/26] mount options: fix reiserfs

2008-01-24 Thread Miklos Szeredi

From: Miklos Szeredi [EMAIL PROTECTED]

Add a .show_options super operation to reiserfs.

Use generic_show_options() and save the complete option string in
reiserfs_fill_super() and reiserfs_remount().

Signed-off-by: Miklos Szeredi [EMAIL PROTECTED]
---

Index: linux/fs/reiserfs/super.c
===
--- linux.orig/fs/reiserfs/super.c  2008-01-17 19:00:55.0 +0100
+++ linux/fs/reiserfs/super.c   2008-01-22 21:20:33.0 +0100
@@ -617,6 +617,7 @@ static const struct super_operations rei
.unlockfs = reiserfs_unlockfs,
.statfs = reiserfs_statfs,
.remount_fs = reiserfs_remount,
+   .show_options = generic_show_options,
 #ifdef CONFIG_QUOTA
.quota_read = reiserfs_quota_read,
.quota_write = reiserfs_quota_write,
@@ -1138,6 +1139,7 @@ static int reiserfs_remount(struct super
unsigned long safe_mask = 0;
unsigned int commit_max_age = (unsigned int)-1;
struct reiserfs_journal *journal = SB_JOURNAL(s);
+   char *new_opts = kstrdup(arg, GFP_KERNEL);
int err;
 #ifdef CONFIG_QUOTA
int i;
@@ -1153,7 +1155,8 @@ static int reiserfs_remount(struct super
REISERFS_SB(s)->s_qf_names[i] = NULL;
}
 #endif
-   return -EINVAL;
+   err = -EINVAL;
+   goto out_err;
}
 
handle_attrs(s);
@@ -1191,9 +1194,9 @@ static int reiserfs_remount(struct super
}
 
if (blocks) {
-   int rc = reiserfs_resize(s, blocks);
-   if (rc != 0)
-   return rc;
+   err = reiserfs_resize(s, blocks);
+   if (err != 0)
+   goto out_err;
}
 
if (*mount_flags & MS_RDONLY) {
@@ -1201,16 +1204,16 @@ static int reiserfs_remount(struct super
/* remount read-only */
if (s->s_flags & MS_RDONLY)
/* it is read-only already */
-   return 0;
+   goto out_ok;
/* try to remount file system with read-only permissions */
if (sb_umount_state(rs) == REISERFS_VALID_FS
|| REISERFS_SB(s)->s_mount_state != REISERFS_VALID_FS) {
-   return 0;
+   goto out_ok;
}
 
err = journal_begin(&th, s, 10);
if (err)
-   return err;
+   goto out_err;
 
/* Mounting a rw partition read-only. */
reiserfs_prepare_for_journal(s, SB_BUFFER_WITH_SB(s), 1);
@@ -1220,11 +1223,13 @@ static int reiserfs_remount(struct super
/* remount read-write */
if (!(s->s_flags & MS_RDONLY)) {
reiserfs_xattr_init(s, *mount_flags);
-   return 0;   /* We are read-write already */
+   goto out_ok;/* We are read-write already */
}
 
-   if (reiserfs_is_journal_aborted(journal))
-   return journal->j_errno;
+   if (reiserfs_is_journal_aborted(journal)) {
+   err = journal->j_errno;
+   goto out_err;
+   }
 
handle_data_mode(s, mount_options);
handle_barrier_mode(s, mount_options);
@@ -1232,7 +1237,7 @@ static int reiserfs_remount(struct super
s->s_flags &= ~MS_RDONLY;   /* now it is safe to call journal_begin */
err = journal_begin(&th, s, 10);
if (err)
-   return err;
+   goto out_err;
 
/* Mount a partition which is read-only, read-write */
reiserfs_prepare_for_journal(s, SB_BUFFER_WITH_SB(s), 1);
@@ -1247,7 +1252,7 @@ static int reiserfs_remount(struct super
SB_JOURNAL(s)->j_must_wait = 1;
err = journal_end(&th, s, 10);
if (err)
-   return err;
+   goto out_err;
s->s_dirt = 0;
 
if (!(*mount_flags & MS_RDONLY)) {
@@ -1255,7 +1260,14 @@ static int reiserfs_remount(struct super
reiserfs_xattr_init(s, *mount_flags);
}
 
+out_ok:
+   kfree(s->s_options);
+   s->s_options = new_opts;
return 0;
+
+out_err:
+   kfree(new_opts);
+   return err;
 }
 
 static int read_super_block(struct super_block *s, int offset)
@@ -1559,6 +1571,8 @@ static int reiserfs_fill_super(struct su
struct reiserfs_sb_info *sbi;
int errval = -EINVAL;
 
+   save_mount_options(s, data);
+
sbi = kzalloc(sizeof(struct reiserfs_sb_info), GFP_KERNEL);
if (!sbi) {
errval = -ENOMEM;

[patch 20/26] mount options: fix ncpfs

2008-01-24 Thread Miklos Szeredi

From: Miklos Szeredi [EMAIL PROTECTED]

Add a .show_options super operation to ncpfs.

Small fix: add FS_BINARY_MOUNTDATA to the filesystem type flags, since
it can take binary data, as well as text (similarly to NFS).

Signed-off-by: Miklos Szeredi [EMAIL PROTECTED]
---

Index: linux/fs/ncpfs/inode.c
===
--- linux.orig/fs/ncpfs/inode.c 2008-01-24 13:48:37.0 +0100
+++ linux/fs/ncpfs/inode.c  2008-01-24 15:57:17.0 +0100
@@ -28,6 +28,8 @@
 #include <linux/init.h>
 #include <linux/smp_lock.h>
 #include <linux/vfs.h>
+#include <linux/mount.h>
+#include <linux/seq_file.h>
 
 #include <linux/ncp_fs.h>
 
@@ -36,9 +38,15 @@
 #include "ncplib_kernel.h"
 #include "getopt.h"
 
+#define NCP_DEFAULT_FILE_MODE 0600
+#define NCP_DEFAULT_DIR_MODE 0700
+#define NCP_DEFAULT_TIME_OUT 10
+#define NCP_DEFAULT_RETRY_COUNT 20
+
 static void ncp_delete_inode(struct inode *);
 static void ncp_put_super(struct super_block *);
 static int  ncp_statfs(struct dentry *, struct kstatfs *);
+static int  ncp_show_options(struct seq_file *, struct vfsmount *);
 
 static struct kmem_cache * ncp_inode_cachep;
 
@@ -96,6 +104,7 @@ static const struct super_operations ncp
.put_super  = ncp_put_super,
.statfs = ncp_statfs,
.remount_fs = ncp_remount,
+   .show_options   = ncp_show_options,
 };
 
 extern struct dentry_operations ncp_root_dentry_operations;
@@ -304,6 +313,37 @@ static void ncp_stop_tasks(struct ncp_se
flush_scheduled_work();
 }
 
+static int  ncp_show_options(struct seq_file *seq, struct vfsmount *mnt)
+{
+   struct ncp_server *server = NCP_SBP(mnt->mnt_sb);
+   unsigned int tmp;
+
+   if (server->m.uid != 0)
+   seq_printf(seq, ",uid=%u", server->m.uid);
+   if (server->m.gid != 0)
+   seq_printf(seq, ",gid=%u", server->m.gid);
+   if (server->m.mounted_uid != 0)
+   seq_printf(seq, ",owner=%u", server->m.mounted_uid);
+   tmp = server->m.file_mode & S_IALLUGO;
+   if (tmp != NCP_DEFAULT_FILE_MODE)
+   seq_printf(seq, ",mode=0%o", tmp);
+   tmp = server->m.dir_mode & S_IALLUGO;
+   if (tmp != NCP_DEFAULT_DIR_MODE)
+   seq_printf(seq, ",dirmode=0%o", tmp);
+   if (server->m.time_out != NCP_DEFAULT_TIME_OUT * HZ / 100) {
+   tmp = server->m.time_out * 100 / HZ;
+   seq_printf(seq, ",timeout=%u", tmp);
+   }
+   if (server->m.retry_count != NCP_DEFAULT_RETRY_COUNT)
+   seq_printf(seq, ",retry=%u", server->m.retry_count);
+   if (server->m.flags != 0)
+   seq_printf(seq, ",flags=%lu", server->m.flags);
+   if (server->m.wdog_pid != NULL)
+   seq_printf(seq, ",wdogpid=%u", pid_vnr(server->m.wdog_pid));
+
+   return 0;
+}
+
 static const struct ncp_option ncp_opts[] = {
{ "uid", OPT_INT, 'u' },
{ "gid", OPT_INT, 'g' },
@@ -331,12 +371,12 @@ static int ncp_parse_options(struct ncp_
data->mounted_uid = 0;
data->wdog_pid = NULL;
data->ncp_fd = ~0;
-   data->time_out = 10;
-   data->retry_count = 20;
+   data->time_out = NCP_DEFAULT_TIME_OUT;
+   data->retry_count = NCP_DEFAULT_RETRY_COUNT;
data->uid = 0;
data->gid = 0;
-   data->file_mode = 0600;
-   data->dir_mode = 0700;
+   data->file_mode = NCP_DEFAULT_FILE_MODE;
+   data->dir_mode = NCP_DEFAULT_DIR_MODE;
data->info_fd = -1;
data->mounted_vol[0] = 0;

@@ -982,6 +1022,7 @@ static struct file_system_type ncp_fs_ty
.name   = "ncpfs",
.get_sb = ncp_get_sb,
.kill_sb= kill_anon_super,
+   .fs_flags   = FS_BINARY_MOUNTDATA,
 };
 
 static int __init init_ncp_fs(void)

[patch 19/26] mount options: fix jfs

2008-01-24 Thread Miklos Szeredi

From: Miklos Szeredi [EMAIL PROTECTED]

Add iocharset= and errors= options to /proc/mounts for jfs
filesystems.

Signed-off-by: Miklos Szeredi [EMAIL PROTECTED]
---

Index: linux/fs/jfs/super.c
===
--- linux.orig/fs/jfs/super.c   2008-01-17 19:00:55.0 +0100
+++ linux/fs/jfs/super.c    2008-01-21 19:39:30.0 +0100
@@ -602,6 +602,12 @@ static int jfs_show_options(struct seq_f
seq_printf(seq, ",umask=%03o", sbi->umask);
if (sbi->flag & JFS_NOINTEGRITY)
seq_puts(seq, ",nointegrity");
+   if (sbi->nls_tab)
+   seq_printf(seq, ",iocharset=%s", sbi->nls_tab->charset);
+   if (sbi->flag & JFS_ERR_CONTINUE)
+   seq_printf(seq, ",errors=continue");
+   if (sbi->flag & JFS_ERR_PANIC)
+   seq_printf(seq, ",errors=panic");
 
 #ifdef CONFIG_QUOTA
if (sbi->flag & JFS_USRQUOTA)

[patch 17/26] mount options: fix hugetlbfs

2008-01-24 Thread Miklos Szeredi

From: Miklos Szeredi [EMAIL PROTECTED]

Add a .show_options super operation to hugetlbfs.

Use generic_show_options() and save the complete option string in
hugetlbfs_fill_super().

Signed-off-by: Miklos Szeredi [EMAIL PROTECTED]
---

Index: linux/fs/hugetlbfs/inode.c
===
--- linux.orig/fs/hugetlbfs/inode.c 2008-01-22 21:31:53.0 +0100
+++ linux/fs/hugetlbfs/inode.c  2008-01-22 21:32:20.0 +0100
@@ -734,6 +734,7 @@ static const struct super_operations hug
.delete_inode   = hugetlbfs_delete_inode,
.drop_inode = hugetlbfs_drop_inode,
.put_super  = hugetlbfs_put_super,
+   .show_options   = generic_show_options,
 };
 
 static int
@@ -817,6 +818,8 @@ hugetlbfs_fill_super(struct super_block 
struct hugetlbfs_config config;
struct hugetlbfs_sb_info *sbinfo;
 
+   save_mount_options(sb, data);
+
config.nr_blocks = -1; /* No limit on size by default */
config.nr_inodes = -1; /* No limit on number of inodes by default */
config.uid = current->fsuid;

Re: [patch 10/26] mount options: fix devpts

2008-01-24 Thread H. Peter Anvin


Miklos Szeredi wrote:

> Also add minor fix: when parsing the mode option, mask with
> S_IALLUGO instead of ~S_IFMT, which could leave unused bits in the
> mask.


umode_t is 16 bits, so it doesn't.  The change is still good, of course.
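
The point about width can be checked with a short userspace sketch. The macro values below are the usual kernel octal constants, redefined locally as `DEMO_*` so the example is self-contained, and `uint16_t` stands in for a 16-bit `umode_t`; all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the kernel constants, assumed values. */
#define DEMO_S_IFMT    0170000u	/* file type bits (bits 12..15) */
#define DEMO_S_IALLUGO 0007777u	/* setuid/setgid/sticky + rwx bits */

/* The permission mask covers exactly the low 12 bits. */
static_assert(DEMO_S_IALLUGO == 0x0FFFu, "S_IALLUGO is the low 12 bits");

/* On a 16-bit type, clearing S_IFMT leaves only the S_IALLUGO bits. */
static uint16_t demo_mask_narrow_not_ifmt(uint16_t mode)
{
	return (uint16_t)(mode & ~DEMO_S_IFMT);
}

/* On a wider type, ~S_IFMT lets bits above bit 15 through. */
static unsigned int demo_mask_wide_not_ifmt(unsigned int value)
{
	return value & ~DEMO_S_IFMT;
}

/* S_IALLUGO gives the same result regardless of the type's width. */
static unsigned int demo_mask_wide_allugo(unsigned int value)
{
	return value & DEMO_S_IALLUGO;
}
```

For a 16-bit value the two masks agree, which matches the remark above; the difference only shows up while the parsed value is still held in a wider integer.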


> +   if (config.mode != DEVPTS_DEFAULT_MODE)
> +   seq_printf(seq, ",mode=%03o", config.mode);


I would rather this be unconditional, than that it be conditional on 
something other than the user having specified it in the first place.


Other than that,

Acked-by: H. Peter Anvin [EMAIL PROTECTED]

-hpa

Re: [patch 07/26] mount options: fix autofs

2008-01-24 Thread H. Peter Anvin


Miklos Szeredi wrote:
[autofs patch]

Acked-by: H. Peter Anvin [EMAIL PROTECTED]

Re: [patch 03/26] mount options: fix adfs

2008-01-24 Thread Russell King

On Thu, Jan 24, 2008 at 08:33:44PM +0100, Miklos Szeredi wrote:
> From: Miklos Szeredi [EMAIL PROTECTED]
>
> Add a .show_options super operation to adfs.
>
> Signed-off-by: Miklos Szeredi [EMAIL PROTECTED]

Thanks.

Acked-by: Russell King [EMAIL PROTECTED]

-- 
Russell King
 Linux kernel2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:

Re: [patch 25/26] mount options: fix udf

2008-01-24 Thread Cyrill Gorcunov

[Miklos Szeredi - Thu, Jan 24, 2008 at 08:34:06PM +0100]
| From: Miklos Szeredi [EMAIL PROTECTED]
| 
| Add a .show_options super operation to udf.
| 
| Signed-off-by: Miklos Szeredi [EMAIL PROTECTED]
| ---
| 
| Index: linux/fs/udf/super.c
| ===
| --- linux.orig/fs/udf/super.c 2008-01-24 13:48:37.0 +0100
| +++ linux/fs/udf/super.c  2008-01-24 15:58:21.0 +0100
| @@ -53,6 +53,8 @@
|  #include <linux/vfs.h>
|  #include <linux/vmalloc.h>
|  #include <linux/errno.h>
| +#include <linux/mount.h>
| +#include <linux/seq_file.h>
|  #include <asm/byteorder.h>
|  
|  #include <linux/udf_fs.h>
| @@ -71,6 +73,8 @@
|  #define VDS_POS_TERMINATING_DESC 6
|  #define VDS_POS_LENGTH   7
|  
| +#define UDF_DEFAULT_BLOCKSIZE 2048
| +

Thanks, that is a good cleanup.

|  static char error_buf[1024];
|  
|  /* These are the meat - everything else is stuffing */
| @@ -95,6 +99,7 @@ static void udf_open_lvid(struct super_b
|  static void udf_close_lvid(struct super_block *);
|  static unsigned int udf_count_free(struct super_block *);
|  static int udf_statfs(struct dentry *, struct kstatfs *);
| +static int udf_show_options(struct seq_file *, struct vfsmount *);
|  
|  struct logicalVolIntegrityDescImpUse *udf_sb_lvidiu(struct udf_sb_info *sbi)
|  {
| @@ -181,6 +186,7 @@ static const struct super_operations udf
|   .write_super= udf_write_super,
|   .statfs = udf_statfs,
|   .remount_fs = udf_remount_fs,
| + .show_options   = udf_show_options,
|  };
|  
|  struct udf_options {
| @@ -247,6 +253,56 @@ static int udf_sb_alloc_partition_maps(s
|   return 0;
|  }
|  
| +static int udf_show_options(struct seq_file *seq, struct vfsmount *mnt)
| +{
| +	struct super_block *sb = mnt->mnt_sb;
| +	struct udf_sb_info *sbi = UDF_SB(sb);
| +
| +	if (!UDF_QUERY_FLAG(sb, UDF_FLAG_STRICT))
| +		seq_puts(seq, ",nostrict");
| +	if (sb->s_blocksize != UDF_DEFAULT_BLOCKSIZE)
| +		seq_printf(seq, ",bs=%lu", sb->s_blocksize);
| +	if (UDF_QUERY_FLAG(sb, UDF_FLAG_UNHIDE))
| +		seq_puts(seq, ",unhide");
| +	if (UDF_QUERY_FLAG(sb, UDF_FLAG_UNDELETE))
| +		seq_puts(seq, ",undelete");
| +	if (!UDF_QUERY_FLAG(sb, UDF_FLAG_USE_AD_IN_ICB))
| +		seq_puts(seq, ",noadinicb");
| +	if (UDF_QUERY_FLAG(sb, UDF_FLAG_USE_SHORT_AD))
| +		seq_puts(seq, ",shortad");
| +	if (UDF_QUERY_FLAG(sb, UDF_FLAG_UID_FORGET))
| +		seq_puts(seq, ",uid=forget");
| +	if (UDF_QUERY_FLAG(sb, UDF_FLAG_UID_IGNORE))
| +		seq_puts(seq, ",uid=ignore");
| +	if (UDF_QUERY_FLAG(sb, UDF_FLAG_GID_FORGET))
| +		seq_puts(seq, ",gid=forget");
| +	if (UDF_QUERY_FLAG(sb, UDF_FLAG_GID_IGNORE))
| +		seq_puts(seq, ",gid=ignore");
| +	if (UDF_QUERY_FLAG(sb, UDF_FLAG_UID_SET))
| +		seq_printf(seq, ",uid=%u", sbi->s_uid);
| +	if (UDF_QUERY_FLAG(sb, UDF_FLAG_GID_SET))
| +		seq_printf(seq, ",gid=%u", sbi->s_gid);
| +	if (sbi->s_umask != 0)
| +		seq_printf(seq, ",umask=%o", sbi->s_umask);
| +	if (UDF_QUERY_FLAG(sb, UDF_FLAG_SESSION_SET))
| +		seq_printf(seq, ",session=%u", sbi->s_session);
| +	if (UDF_QUERY_FLAG(sb, UDF_FLAG_LASTBLOCK_SET))
| +		seq_printf(seq, ",lastblock=%u", sbi->s_last_block);
| +	/* is this correct? */
| +	if (sbi->s_anchor[2] != 0)
| +		seq_printf(seq, ",anchor=%u", sbi->s_anchor[2]);

I would prefer the form UDF_SB_ANCHOR(sb)[2] here, for the sake of
consistent style, but we should wait for Jan's decision (I'm not
the expert in this area ;)

| +	/*
| +	 * volume, partition, fileset and rootdir seem to be ignored
| +	 * currently
| +	 */
| +	if (UDF_QUERY_FLAG(sb, UDF_FLAG_UTF8))
| +		seq_puts(seq, ",utf8");
| +	if (UDF_QUERY_FLAG(sb, UDF_FLAG_NLS_MAP) && sbi->s_nls_map)
| +		seq_printf(seq, ",iocharset=%s", sbi->s_nls_map->charset);
| +
| + return 0;
| +}
| +
|  /*
|   * udf_parse_options
|   *
| @@ -339,13 +395,14 @@ static match_table_t tokens = {
|   {Opt_err,   NULL}
|  };
|  
| -static int udf_parse_options(char *options, struct udf_options *uopt)
| +static int udf_parse_options(char *options, struct udf_options *uopt,
| +  bool remount)
|  {
|   char *p;
|   int option;
|  
|  	uopt->novrs = 0;
| -	uopt->blocksize = 2048;
| +	uopt->blocksize = UDF_DEFAULT_BLOCKSIZE;
|  	uopt->partition = 0x;
|  	uopt->session = 0x;
|  	uopt->lastblock = 0;
| @@ -415,11 +472,15 @@ static int udf_parse_options(char *optio
|  			if (match_int(args, &option))
|  				return 0;
|  			uopt->session = option;
| +			if (!remount)
| +				uopt->flags |= (1 << UDF_FLAG_SESSION_SET);
|   break;
|   case Opt_lastblock:
|   if

[RFC] ext3: per-process soft-syncing data=ordered mode

2008-01-24 Thread Al Boldi

Greetings!

data=ordered mode has proven reliable over the years, and it does this by
ordering filedata flushes before metadata flushes.  But this sometimes
causes contention on the order of a 10x slowdown for certain apps, either
due to the misuse of fsync or due to inherent behaviour like db's, as well
as inherent starvation issues exposed by the data=ordered mode.

data=writeback mode alleviates data=ordered mode slowdowns, but only works
per-mount and is too dangerous to run as a default mode.

This RFC proposes to introduce a tunable which allows disabling fsync and
changing ordered into writeback writeout on a per-process basis like this:

  echo 1 > /proc/`pidof process`/softsync


Your comments are much welcome!


Thanks!

--
Al


Re: [patch 21/26] mount options: partially fix nfs

2008-01-24 Thread Chuck Lever


Hi Miklos-

Miklos Szeredi wrote:

From: Miklos Szeredi [EMAIL PROTECTED]

Add posix, bsize=, namelen= options to /proc/mounts for nfs
filesystems.

Document several other options that are still missing.


NFS lists only some options in /proc/mounts on purpose: only the 
essential options are mentioned there to keep clutter down.  The three 
you've added here are for all intents and purposes deprecated, which is 
why they are not supported.


NFS lists a more complete set of mount options for a mount point in 
/proc/self/mountstats.  See nfs_show_stats().
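
For anyone who wants to check from userspace which options a given NFS
mount actually reports, a minimal sketch using the glibc mntent interface
follows. It assumes the options of interest appear in the file being read
(normally /proc/mounts); the helper names print_nfs_mount_options() and
mount_has_option() are made up for this illustration.

```c
#include <mntent.h>
#include <stdio.h>
#include <string.h>

/*
 * Print the option string of every NFS mount listed in `path`
 * (normally "/proc/mounts"). Returns the number of NFS entries seen,
 * or -1 if the file cannot be opened.
 */
static int print_nfs_mount_options(const char *path)
{
    FILE *fp = setmntent(path, "r");
    struct mntent *ent;
    int count = 0;

    if (fp == NULL)
        return -1;
    while ((ent = getmntent(fp)) != NULL) {
        /* Only look at entries whose filesystem type is nfs or nfs4. */
        if (strcmp(ent->mnt_type, "nfs") != 0 &&
            strcmp(ent->mnt_type, "nfs4") != 0)
            continue;
        printf("%s on %s: %s\n", ent->mnt_fsname, ent->mnt_dir, ent->mnt_opts);
        count++;
    }
    endmntent(fp);
    return count;
}

/* Return 1 if the comma-separated option list of `ent` contains `opt`. */
static int mount_has_option(const struct mntent *ent, const char *opt)
{
    return hasmntopt(ent, opt) != NULL;
}
```

hasmntopt() matches both bare flags ("posix") and the name part of
"name=value" options ("rsize"), so the same helper covers both kinds of
option discussed in this thread.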


Since your cover letter does not explain why you are changing this code, 
can you refer me to a description of why you are doing this?


More below.


Signed-off-by: Miklos Szeredi [EMAIL PROTECTED]
---

Index: linux/fs/nfs/super.c
===
--- linux.orig/fs/nfs/super.c   2008-01-19 11:56:34.0 +0100
+++ linux/fs/nfs/super.c2008-01-21 20:41:30.0 +0100
@@ -449,6 +449,7 @@ static void nfs_show_mount_options(struc
 	} nfs_info[] = {
 		{ NFS_MOUNT_SOFT, ",soft", ",hard" },
 		{ NFS_MOUNT_INTR, ",intr", ",nointr" },
+		{ NFS_MOUNT_POSIX, ",posix", "" },
 		{ NFS_MOUNT_NOCTO, ",nocto", "" },
 		{ NFS_MOUNT_NOAC, ",noac", "" },
 		{ NFS_MOUNT_NONLM, ",nolock", "" },
@@ -459,10 +460,17 @@ static void nfs_show_mount_options(struc
 	};
 	const struct proc_nfs_info *nfs_infop;
 	struct nfs_client *clp = nfss->nfs_client;
+	unsigned int default_namelen =
+		clp->rpc_ops->version == 4 ? NFS4_MAXNAMLEN :
+		clp->rpc_ops->version == 3 ? NFS3_MAXNAMLEN : NFS2_MAXNAMLEN;
 
 	seq_printf(m, ",vers=%d", clp->rpc_ops->version);
 	seq_printf(m, ",rsize=%d", nfss->rsize);
 	seq_printf(m, ",wsize=%d", nfss->wsize);
+	if (nfss->bsize != 0)
+		seq_printf(m, ",bsize=%d", nfss->bsize);
+	if (nfss->namelen != default_namelen)
+		seq_printf(m, ",namelen=%d", nfss->namelen);
 	if (nfss->acregmin != 3*HZ || showdefaults)
 		seq_printf(m, ",acregmin=%d", nfss->acregmin/HZ);
 	if (nfss->acregmax != 60*HZ || showdefaults)
@@ -482,6 +490,18 @@ static void nfs_show_mount_options(struc
 	seq_printf(m, ",timeo=%lu", 10U * nfss->client->cl_timeout->to_initval / HZ);
 	seq_printf(m, ",retrans=%u", nfss->client->cl_timeout->to_retries);
 	seq_printf(m, ",sec=%s", nfs_pseudoflavour_to_name(nfss->client->cl_auth->au_flavor));
+
+   /*
+* Missing options:
+* port=


Probably should be supported.


+* addr=


This one is already supported; see nfs_show_options().


+* clientaddr=


This one isn't, and should be... would be useful for tracking down 
certain NFSv4 problems.



+* mounthost=
+* mountaddr=

 +   * mountport=
 +   * mountvers=
 +   * mountproto=

And these mount* options are for the kernel's new mount protocol client.
They aren't really useful for understanding steady-state NFS client
behavior; they only affect mount-time behavior.

Re: [patch 21/26] mount options: partially fix nfs

2008-01-24 Thread Trond Myklebust

On Thu, 2008-01-24 at 20:34 +0100, Miklos Szeredi wrote:
 plain text document attachment (nfs_opts.patch)
 From: Miklos Szeredi [EMAIL PROTECTED]
 
 Add posix, bsize=, namelen= options to /proc/mounts for nfs
 filesystems.
 
 Document several other options that are still missing.
 
 Signed-off-by: Miklos Szeredi [EMAIL PROTECTED]
 ---
 
 Index: linux/fs/nfs/super.c
 ===
 --- linux.orig/fs/nfs/super.c 2008-01-19 11:56:34.0 +0100
 +++ linux/fs/nfs/super.c  2008-01-21 20:41:30.0 +0100
 @@ -449,6 +449,7 @@ static void nfs_show_mount_options(struc
  	} nfs_info[] = {
  		{ NFS_MOUNT_SOFT, ",soft", ",hard" },
  		{ NFS_MOUNT_INTR, ",intr", ",nointr" },
 +		{ NFS_MOUNT_POSIX, ",posix", "" },
  		{ NFS_MOUNT_NOCTO, ",nocto", "" },
  		{ NFS_MOUNT_NOAC, ",noac", "" },
  		{ NFS_MOUNT_NONLM, ",nolock", "" },
 @@ -459,10 +460,17 @@ static void nfs_show_mount_options(struc
  	};
  	const struct proc_nfs_info *nfs_infop;
  	struct nfs_client *clp = nfss->nfs_client;
 +	unsigned int default_namelen =
 +		clp->rpc_ops->version == 4 ? NFS4_MAXNAMLEN :
 +		clp->rpc_ops->version == 3 ? NFS3_MAXNAMLEN : NFS2_MAXNAMLEN;
  	seq_printf(m, ",vers=%d", clp->rpc_ops->version);
  	seq_printf(m, ",rsize=%d", nfss->rsize);
  	seq_printf(m, ",wsize=%d", nfss->wsize);
 +	if (nfss->bsize != 0)
 +		seq_printf(m, ",bsize=%d", nfss->bsize);
 +	if (nfss->namelen != default_namelen)
 +		seq_printf(m, ",namelen=%d", nfss->namelen);

You really just want to look at the value of nfss->namelen. It should
always be set.

  	if (nfss->acregmin != 3*HZ || showdefaults)
  		seq_printf(m, ",acregmin=%d", nfss->acregmin/HZ);
  	if (nfss->acregmax != 60*HZ || showdefaults)
 @@ -482,6 +490,18 @@ static void nfs_show_mount_options(struc
  	seq_printf(m, ",timeo=%lu", 10U * nfss->client->cl_timeout->to_initval / HZ);
  	seq_printf(m, ",retrans=%u", nfss->client->cl_timeout->to_retries);
  	seq_printf(m, ",sec=%s", nfs_pseudoflavour_to_name(nfss->client->cl_auth->au_flavor));
 +
 + /*
 +  * Missing options:
 +  * port=
 +  * mountport=
 +  * mountvers=
 +  * mountproto=
 +  * addr=
 +  * clientaddr=
 +  * mounthost=
 +  * mountaddr=
 +  */

The new text mount interface actually does allow us to store these
values if we really do need to. That should be a separate patch,
however.

Trond

Re: [PATCH 24/27] NFS: Use local caching [try #2]

2008-01-24 Thread Trond Myklebust


On Wed, 2008-01-23 at 17:22 +, David Howells wrote:
 The attached patch makes it possible for the NFS filesystem to make use of the
 network filesystem local caching service (FS-Cache).
 
 To be able to use this, an updated mount program is required.  This can be
 obtained from:
 
   http://people.redhat.com/steved/fscache/util-linux/
 
 To mount an NFS filesystem to use caching, add an fsc option to the mount:
 
   mount warthog:/ /a -o fsc

Nope. The new text-based mount code should just work. There should be no
need to also support cachefs via the legacy binary format.

 Signed-off-by: David Howells [EMAIL PROTECTED]
 ---
 
  fs/nfs/Makefile   |1 
  fs/nfs/client.c   |5 +
  fs/nfs/file.c |   37 
  fs/nfs/fscache-def.c  |  289 +
  fs/nfs/fscache.c  |  391 
 +
  fs/nfs/fscache.h  |  148 +
  fs/nfs/inode.c|   47 +
  fs/nfs/read.c |   28 +++
  fs/nfs/super.c|3 
  fs/nfs/sysctl.c   |1 
  include/linux/nfs_fs.h|9 +
  include/linux/nfs_fs_sb.h |   18 ++
  12 files changed, 968 insertions(+), 9 deletions(-)
  create mode 100644 fs/nfs/fscache-def.c
  create mode 100644 fs/nfs/fscache.c
  create mode 100644 fs/nfs/fscache.h
 

This needs to be split up.

Scheduling an fscache write, retrieving pages from fscache, managing
fscache cache consistency, adding statistics are all examples of
completely separate tasks that should not be bunched together in a
single megapatch, particularly not since they touch core NFS code.

Trond

Re: [PATCH 25/27] NFS: Configuration and mount option changes to enable local caching on NFS [try #2]

2008-01-24 Thread Trond Myklebust


On Wed, 2008-01-23 at 17:22 +, David Howells wrote:
 Changes to the kernel configuration definitions and to the NFS mount options to
 allow the local caching support added by the previous patch to be enabled.
 
 Signed-off-by: David Howells [EMAIL PROTECTED]
 ---
 
  fs/Kconfig|8 
  fs/nfs/client.c   |2 ++
  fs/nfs/internal.h |1 +
  fs/nfs/super.c|   14 ++
  4 files changed, 25 insertions(+), 0 deletions(-)
 
 
 diff --git a/fs/Kconfig b/fs/Kconfig
 index e95b11c..39b1981 100644
 --- a/fs/Kconfig
 +++ b/fs/Kconfig
 @@ -1650,6 +1650,14 @@ config NFS_V4
  
 If unsure, say N.
  
 +config NFS_FSCACHE
 +	bool "Provide NFS client caching support (EXPERIMENTAL)"
 +	depends on EXPERIMENTAL
 +	depends on NFS_FS=m && FSCACHE || NFS_FS=y && FSCACHE=y
 +	help
 +	  Say Y here if you want NFS data to be cached locally on disc through
 +	  the general filesystem cache manager
 +
  config NFS_DIRECTIO
   bool Allow direct I/O on NFS files
   depends on NFS_FS
 diff --git a/fs/nfs/client.c b/fs/nfs/client.c
 index bcdc5d0..92f9b84 100644
 --- a/fs/nfs/client.c
 +++ b/fs/nfs/client.c
 @@ -572,6 +572,7 @@ static int nfs_init_server(struct nfs_server *server,
  
  	/* Initialise the client representation from the mount data */
  	server->flags = data->flags & NFS_MOUNT_FLAGMASK;
 +	server->options = data->options;
  
  	if (data->rsize)
  		server->rsize = nfs_block_size(data->rsize, NULL);
 @@ -931,6 +932,7 @@ static int nfs4_init_server(struct nfs_server *server,
  	/* Initialise the client representation from the mount data */
  	server->flags = data->flags & NFS_MOUNT_FLAGMASK;
  	server->caps |= NFS_CAP_ATOMIC_OPEN;
 +	server->options = data->options;
  
  	if (data->rsize)
  		server->rsize = nfs_block_size(data->rsize, NULL);
 diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
 index f3acf48..ef09e00 100644
 --- a/fs/nfs/internal.h
 +++ b/fs/nfs/internal.h
 @@ -35,6 +35,7 @@ struct nfs_parsed_mount_data {
  	int			acregmin, acregmax,
  				acdirmin, acdirmax;
  	int			namlen;
 +	unsigned int		options;
  	unsigned int		bsize;
  	unsigned int		auth_flavor_len;
  	rpc_authflavor_t	auth_flavors[1];
 diff --git a/fs/nfs/super.c b/fs/nfs/super.c
 index 6dd628f..0542550 100644
 --- a/fs/nfs/super.c
 +++ b/fs/nfs/super.c
 @@ -74,6 +74,7 @@ enum {
   Opt_acl, Opt_noacl,
   Opt_rdirplus, Opt_nordirplus,
   Opt_sharecache, Opt_nosharecache,
 + Opt_fscache, Opt_nofscache,
  
   /* Mount options that take integer arguments */
   Opt_port,
 @@ -123,6 +124,8 @@ static match_table_t nfs_mount_option_tokens = {
  	{ Opt_nordirplus, "nordirplus" },
  	{ Opt_sharecache, "sharecache" },
  	{ Opt_nosharecache, "nosharecache" },
 +	{ Opt_fscache, "fsc" },
 +	{ Opt_nofscache, "nofsc" },
  
  	{ Opt_port, "port=%u" },
  	{ Opt_rsize, "rsize=%u" },
 @@ -459,6 +462,8 @@ static void nfs_show_mount_options(struct seq_file *m, struct nfs_server *nfss,
  	seq_printf(m, ",timeo=%lu", 10U * clp->retrans_timeo / HZ);
  	seq_printf(m, ",retrans=%u", clp->retrans_count);
  	seq_printf(m, ",sec=%s", nfs_pseudoflavour_to_name(nfss->client->cl_auth->au_flavor));
 +	if (nfss->options & NFS_OPTION_FSCACHE)
 +		seq_printf(m, ",fsc");
  }
  
  /*
 @@ -697,6 +702,15 @@ static int nfs_parse_mount_options(char *raw,
  			break;
  		case Opt_nosharecache:
  			mnt->flags |= NFS_MOUNT_UNSHARED;
 +			mnt->options &= ~NFS_OPTION_FSCACHE;
 +			break;
 +		case Opt_fscache:
 +			/* sharing is mandatory with fscache */
 +			mnt->options |= NFS_OPTION_FSCACHE;
 +			mnt->flags &= ~NFS_MOUNT_UNSHARED;
 +			break;

This is confusing. If the mount options are incompatible, then it makes
more sense to return an EINVAL instead of silently turning one of them
off.

 +		case Opt_nofscache:
 +			mnt->options &= ~NFS_OPTION_FSCACHE;
  			break;
  
   case Opt_port:
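
The alternative suggested in the review comment above (fail the mount with
EINVAL rather than silently clearing one option) can be sketched in plain
userspace C. This is not the NFS parser: the DEMO_* flags and
demo_parse_options() are hypothetical stand-ins for NFS_MOUNT_UNSHARED,
NFS_OPTION_FSCACHE and nfs_parse_mount_options().

```c
#include <errno.h>
#include <string.h>

/* Hypothetical flag bits standing in for NFS_MOUNT_UNSHARED and
 * NFS_OPTION_FSCACHE from the quoted patch. */
#define DEMO_UNSHARED 0x1u
#define DEMO_FSCACHE  0x2u

/*
 * Parse a comma-separated option string (modified in place).
 * Returns 0 and stores the flags in *out on success, or -EINVAL for an
 * unknown option or for the conflicting pair "fsc" + "nosharecache".
 */
static int demo_parse_options(char *raw, unsigned int *out)
{
    unsigned int flags = 0;
    char *p = raw;

    while (p != NULL && *p != '\0') {
        char *next = strchr(p, ',');

        if (next != NULL)
            *next++ = '\0';   /* terminate this token, step past the comma */

        if (strcmp(p, "fsc") == 0)
            flags |= DEMO_FSCACHE;
        else if (strcmp(p, "nofsc") == 0)
            flags &= ~DEMO_FSCACHE;
        else if (strcmp(p, "nosharecache") == 0)
            flags |= DEMO_UNSHARED;
        else if (strcmp(p, "sharecache") == 0)
            flags &= ~DEMO_UNSHARED;
        else
            return -EINVAL;
        p = next;
    }

    /* Reject the incompatible combination instead of silently fixing it. */
    if ((flags & DEMO_FSCACHE) && (flags & DEMO_UNSHARED))
        return -EINVAL;

    *out = flags;
    return 0;
}
```

Checking the combination once, after all options have been seen, means
the result does not depend on the order in which the options were given.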

Re: [patch 19/26] mount options: fix jfs

2008-01-24 Thread Dave Kleikamp

On Thu, 2008-01-24 at 20:34 +0100, Miklos Szeredi wrote:
 plain text document attachment (jfs_opts.patch)
 From: Miklos Szeredi [EMAIL PROTECTED]
 
 Add iocharset= and errors= options to /proc/mounts for jfs
 filesystems.
 
 Signed-off-by: Miklos Szeredi [EMAIL PROTECTED]

Acked-by: Dave Kleikamp [EMAIL PROTECTED]

Andrew,
Would you like me to add this to the jfs git tree, or would you like to
handle these patches as a set?

Thanks,
Shaggy

 ---
 
 Index: linux/fs/jfs/super.c
 ===
 --- linux.orig/fs/jfs/super.c 2008-01-17 19:00:55.0 +0100
 +++ linux/fs/jfs/super.c  2008-01-21 19:39:30.0 +0100
 @@ -602,6 +602,12 @@ static int jfs_show_options(struct seq_f
  		seq_printf(seq, ",umask=%03o", sbi->umask);
  	if (sbi->flag & JFS_NOINTEGRITY)
  		seq_puts(seq, ",nointegrity");
 +	if (sbi->nls_tab)
 +		seq_printf(seq, ",iocharset=%s", sbi->nls_tab->charset);
 +	if (sbi->flag & JFS_ERR_CONTINUE)
 +		seq_printf(seq, ",errors=continue");
 +	if (sbi->flag & JFS_ERR_PANIC)
 +		seq_printf(seq, ",errors=panic");
  
  #ifdef CONFIG_QUOTA
  	if (sbi->flag & JFS_USRQUOTA)
 
 --
-- 
David Kleikamp
IBM Linux Technology Center


Re: [PATCH 23/27] NFS: Fix memory leak [try #2]

2008-01-24 Thread Trond Myklebust


On Wed, 2008-01-23 at 17:22 +, David Howells wrote:
 Fix a memory leak whereby multiple clientaddr=xxx mount options just overwrite
 the duplicated client_address option pointer, without freeing the old memory.
 
 Signed-off-by: David Howells [EMAIL PROTECTED]
 ---
 
  fs/nfs/super.c |1 +
  1 files changed, 1 insertions(+), 0 deletions(-)
 
 
 diff --git a/fs/nfs/super.c b/fs/nfs/super.c
 index 0b0c72a..7f5e747 100644
 --- a/fs/nfs/super.c
 +++ b/fs/nfs/super.c
 @@ -936,6 +936,7 @@ static int nfs_parse_mount_options(char *raw,
  			string = match_strdup(args);
  			if (string == NULL)
  				goto out_nomem;
 +			kfree(mnt->client_address);
  			mnt->client_address = string;
  			break;
  		case Opt_mountaddr:

Thanks. This fix has already been applied to the NFS git tree.
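
The pattern behind the one-line fix, shown as a userspace sketch: release
the previously duplicated string before storing the new one. Here free()
stands in for kfree() and a malloc-based copy for match_strdup();
struct demo_mount_data and demo_set_client_address() are hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the parsed mount data in the quoted patch. */
struct demo_mount_data {
    char *client_address;
};

/*
 * Store a private copy of `value`, releasing any earlier copy first so
 * that a repeated option does not leak it.
 * Returns 0 on success, -1 if the copy cannot be allocated.
 */
static int demo_set_client_address(struct demo_mount_data *mnt, const char *value)
{
    size_t len = strlen(value) + 1;
    char *copy = malloc(len);

    if (copy == NULL)
        return -1;
    memcpy(copy, value, len);

    free(mnt->client_address);   /* free(NULL) is a no-op */
    mnt->client_address = copy;
    return 0;
}
```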

Re: [PATCH 24/27] NFS: Use local caching [try #2]

2008-01-24 Thread Chuck Lever


Some comments below.

This patch really ought to be broken into more manageable atomic changes
to make it easier to review, and to provide a more fine-grained
explanation and rationale for each specific change via individual
patch descriptions.


David Howells wrote:

The attached patch makes it possible for the NFS filesystem to make use of the
network filesystem local caching service (FS-Cache).

To be able to use this, an updated mount program is required.  This can be
obtained from:

http://people.redhat.com/steved/fscache/util-linux/


This should no longer be necessary.  The latest mount.nfs subcommand 
from nfs-utils supports text-based mounts when running on kernels 2.6.23 
and later.



To mount an NFS filesystem to use caching, add an fsc option to the mount:

mount warthog:/ /a -o fsc


I hope you intend to provide updates to nfs(5) that describe the new 
mount options you introduce in this and later patches.  You don't 
mention it, but I assume that nofsc is the default behavior.



Signed-off-by: David Howells [EMAIL PROTECTED]
---

 fs/nfs/Makefile   |1 
 fs/nfs/client.c   |5 +

 fs/nfs/file.c |   37 
 fs/nfs/fscache-def.c  |  289 +
 fs/nfs/fscache.c  |  391 +
 fs/nfs/fscache.h  |  148 +
 fs/nfs/inode.c|   47 +
 fs/nfs/read.c |   28 +++
 fs/nfs/super.c|3 
 fs/nfs/sysctl.c   |1 
 include/linux/nfs_fs.h|9 +

 include/linux/nfs_fs_sb.h |   18 ++
 12 files changed, 968 insertions(+), 9 deletions(-)
 create mode 100644 fs/nfs/fscache-def.c
 create mode 100644 fs/nfs/fscache.c
 create mode 100644 fs/nfs/fscache.h


diff --git a/fs/nfs/Makefile b/fs/nfs/Makefile
index df0f41e..073d04c 100644
--- a/fs/nfs/Makefile
+++ b/fs/nfs/Makefile
@@ -16,3 +16,4 @@ nfs-$(CONFIG_NFS_V4)  += nfs4proc.o nfs4xdr.o nfs4state.o 
nfs4renewd.o \
   nfs4namespace.o
 nfs-$(CONFIG_NFS_DIRECTIO) += direct.o
 nfs-$(CONFIG_SYSCTL) += sysctl.o
+nfs-$(CONFIG_NFS_FSCACHE) += fscache.o fscache-def.o
diff --git a/fs/nfs/client.c b/fs/nfs/client.c
index a6f6254..bcdc5d0 100644
--- a/fs/nfs/client.c
+++ b/fs/nfs/client.c
@@ -43,6 +43,7 @@
 #include "delegation.h"
 #include "iostat.h"
 #include "internal.h"
+#include "fscache.h"
 
 #define NFSDBG_FACILITY		NFSDBG_CLIENT
 
@@ -139,6 +140,8 @@ static struct nfs_client *nfs_alloc_client(const char *hostname,

 	clp->cl_state = 1 << NFS4CLNT_LEASE_EXPIRED;
 #endif
 
+	nfs_fscache_get_client_cookie(clp);

+
return clp;
 
 error_3:

@@ -170,6 +173,8 @@ static void nfs_free_client(struct nfs_client *clp)
 
 	nfs4_shutdown_client(clp);
 
+	nfs_fscache_release_client_cookie(clp);

+
 	/* -EIO all pending I/O */
 	if (!IS_ERR(clp->cl_rpcclient))
 		rpc_shutdown_client(clp->cl_rpcclient);
diff --git a/fs/nfs/file.c b/fs/nfs/file.c
index b3bb89f..d492cd7 100644
--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -35,6 +35,7 @@
 #include "delegation.h"
 #include "internal.h"
 #include "iostat.h"
+#include "fscache.h"
 
 #define NFSDBG_FACILITY		NFSDBG_FILE
 
@@ -352,22 +353,48 @@ static int nfs_write_end(struct file *file, struct address_space *mapping,

 	return status < 0 ? status : copied;
 }
 
+/*

+ * Partially or wholly invalidate a page
+ * - Release the private state associated with a page if undergoing complete
+ *   page invalidation
+ * - Called if either PG_private or PG_fscache set on the page
+ * - Caller holds page lock
+ */


Add comments like this in a separate clean up patch.


 static void nfs_invalidate_page(struct page *page, unsigned long offset)
 {
 	if (offset != 0)
 		return;
 	/* Cancel any unstarted writes on this page */
 	nfs_wb_page_cancel(page->mapping->host, page);
+
+	nfs_fscache_invalidate_page(page, page->mapping->host);
 }
 
+/*

+ * Release the private state associated with a page
+ * - Called if either PG_private or PG_fscache set on the page
+ * - Caller holds page lock
+ * - Return true (may release) or false (may not)
+ */
 static int nfs_release_page(struct page *page, gfp_t gfp)
 {
/* If PagePrivate() is set, then the page is not freeable */
-   return 0;
+   if (PagePrivate(page))
+   return 0;
+   return nfs_fscache_release_page(page, gfp);
 }
 
+/*

+ * Attempt to clear the private state associated with a page when an error
+ * occurs that requires the cached contents of an inode to be written back or
+ * destroyed
+ * - Called if either PG_private or PG_fscache set on the page
+ * - Caller holds page lock
+ * - Return 0 if successful, -error otherwise
+ */
 static int nfs_launder_page(struct page *page)
 {
+	wait_on_page_fscache_write(page);
 	return nfs_wb_page(page->mapping->host, page);
 }
 
@@ -387,6 +414,11 @@ const struct address_space_operations nfs_file_aops = {

.launder_page =

Re: [RFC] ext3: per-process soft-syncing data=ordered mode

2008-01-24 Thread Diego Calleja

On Thu, 24 Jan 2008 23:36:00 +0300, Al Boldi [EMAIL PROTECTED] wrote:

 Greetings!
 
 data=ordered mode has proven reliable over the years, and it does this by 
 ordering filedata flushes before metadata flushes.  But this sometimes 
 causes contention in the order of a 10x slowdown for certain apps, either 
 due to the misuse of fsync or due to inherent behaviour like db's, as well 
 as inherent starvation issues exposed by the data=ordered mode.

There's a related bug in bugzilla: 
http://bugzilla.kernel.org/show_bug.cgi?id=9546

Jan Kara's diagnosis there is different, but I think it may be the same
problem:

One process does data-intensive load. Thus in the ordered mode the
transaction is tiny but has tons of data buffers attached. If commit
happens, it takes a long time to sync all the data before the commit
can proceed... In the writeback mode, we don't wait for data buffers, in
the journal mode amount of data to be written is really limited by the
maximum size of a transaction and so we write by much smaller chunks
and better latency is thus ensured.


I'm hitting this bug too. It's surprising that more people aren't
reporting it, because it's really annoying.


There's a patch by Jan Kara (which I'm including here because bugzilla didn't
include it and it took me a while to find). I don't know if it's supposed to
fix the problem, but it'd be interesting to try:




Don't allow too many data buffers in a transaction.

diff --git a/fs/jbd/transaction.c b/fs/jbd/transaction.c
index 08ff6c7..e6f9dd6 100644
--- a/fs/jbd/transaction.c
+++ b/fs/jbd/transaction.c
@@ -163,7 +163,7 @@ repeat_locked:
 	spin_lock(&transaction->t_handle_lock);
 	needed = transaction->t_outstanding_credits + nblocks;
 
-	if (needed > journal->j_max_transaction_buffers) {
+	if (needed > journal->j_max_transaction_buffers || atomic_read(&transaction->t_data_buf_count) > 32768) {
/*
 * If the current transaction is already too large, then start
 * to commit it: we can then go back and attach this handle to
@@ -1528,6 +1528,7 @@ static void __journal_temp_unlink_buffer(struct journal_head *jh)
 		return;
 	case BJ_SyncData:
 		list = &transaction->t_sync_datalist;
+		atomic_dec(&transaction->t_data_buf_count);
 		break;
 	case BJ_Metadata:
 		transaction->t_nr_buffers--;
@@ -1989,6 +1990,7 @@ void __journal_file_buffer(struct journal_head *jh,
 		return;
 	case BJ_SyncData:
 		list = &transaction->t_sync_datalist;
+		atomic_inc(&transaction->t_data_buf_count);
 		break;
 	case BJ_Metadata:
 		transaction->t_nr_buffers++;
diff --git a/include/linux/jbd.h b/include/linux/jbd.h
index d9ecd13..6dd284a 100644
--- a/include/linux/jbd.h
+++ b/include/linux/jbd.h
@@ -541,6 +541,12 @@ struct transaction_s
int t_outstanding_credits;
 
/*
+	 * Number of data buffers on t_sync_datalist attached to
+	 * the transaction.
+	 */
+	atomic_t		t_data_buf_count;
+
+	/*
 * Forward and backward links for the circular list of all transactions
 * awaiting checkpoint. [j_list_lock]
 */
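
The admission check that the patch modifies can be modelled outside the
kernel as follows. This is only a sketch under assumptions: C11 atomics
stand in for atomic_t, the struct and function names are invented for the
example, and 32768 is simply the threshold that appears in the patch.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Threshold copied from the quoted patch. */
#define DEMO_MAX_DATA_BUFS 32768

/* Hypothetical, much reduced model of a journal transaction. */
struct demo_txn {
    int outstanding_credits;     /* metadata blocks already reserved */
    atomic_int data_buf_count;   /* ordered-data buffers attached */
};

/*
 * Decide whether the running transaction should be committed before a new
 * handle asking for `nblocks` credits may join it. A commit is forced when
 * the metadata budget would be exceeded or when too many data buffers have
 * piled up.
 */
static bool demo_txn_should_commit(struct demo_txn *t, int nblocks, int max_buffers)
{
    int needed = t->outstanding_credits + nblocks;

    return needed > max_buffers ||
           atomic_load(&t->data_buf_count) > DEMO_MAX_DATA_BUFS;
}
```

The second condition is what bounds the amount of file data a single
commit has to write out in ordered mode.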

Re: [RFC] ext3: per-process soft-syncing data=ordered mode

2008-01-24 Thread Valdis . Kletnieks

On Thu, 24 Jan 2008 23:36:00 +0300, Al Boldi said:
 data=ordered mode has proven reliable over the years, and it does this by 
 ordering filedata flushes before metadata flushes.  But this sometimes 
 causes contention in the order of a 10x slowdown for certain apps, either 
 due to the misuse of fsync or due to inherent behaviour like db's, as well 
 as inherent starvation issues exposed by the data=ordered mode.

If they're misusing it, they should be fixed.  There should be a limit to
how much the kernel will do to reduce the pain of doing stupid things.

 This RFC proposes to introduce a tunable which allows to disable fsync and 
 changes ordered into writeback writeout on a per-process basis like this:

Well-written programs only call fsync() when they really do need the semantics
of fsync.  Disabling that is just *asking* for trouble.

From rfc2821:

6.1 Reliable Delivery and Replies by Email

   When the receiver-SMTP accepts a piece of mail (by sending a 250 OK
   message in response to DATA), it is accepting responsibility for
   delivering or relaying the message.  It must take this responsibility
   seriously.  It MUST NOT lose the message for frivolous reasons, such
   as because the host later crashes or because of a predictable
   resource shortage.

Some people really *do* think "the CPU took a machine check and after replacing
the motherboard, the resulting fsck ate the file" is a frivolous reason to
lose data.

But if you want to give them enough rope to shoot themselves in the foot with,
I'd suggest abusing LD_PRELOAD to replace the fsync() glibc code instead.  No
need to clutter the kernel with rope that can be (and has been) done in 
userspace.
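
A minimal sketch of the LD_PRELOAD approach mentioned above, assuming a
glibc-style dynamic linker. The file name nofsync.c is arbitrary. A program
run under this shim loses every durability guarantee fsync() would have
given it, which is exactly the rope in question.

```c
/*
 * nofsync.c: an LD_PRELOAD shim that turns fsync() and fdatasync()
 * into no-ops for dynamically linked programs.
 *
 * Build: cc -shared -fPIC -o nofsync.so nofsync.c
 * Use:   LD_PRELOAD=./nofsync.so some_program
 */

/* Replaces the libc wrapper; the file descriptor is never touched. */
int fsync(int fd)
{
    (void)fd;
    return 0;   /* claim success without flushing anything */
}

/* Same treatment for fdatasync(), which many databases prefer. */
int fdatasync(int fd)
{
    (void)fd;
    return 0;
}
```

Because the override lives entirely in userspace and is chosen per
invocation, it gives the per-process behaviour the RFC asks for without
any kernel change. It does not affect statically linked binaries or
programs that issue the system call directly.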



Re: [patch 19/26] mount options: fix jfs

2008-01-24 Thread Andrew Morton

 On Thu, 24 Jan 2008 15:15:01 -0600 Dave Kleikamp [EMAIL PROTECTED] wrote:
 On Thu, 2008-01-24 at 20:34 +0100, Miklos Szeredi wrote:
  plain text document attachment (jfs_opts.patch)
  From: Miklos Szeredi [EMAIL PROTECTED]
  
  Add iocharset= and errors= options to /proc/mounts for jfs
  filesystems.
  
  Signed-off-by: Miklos Szeredi [EMAIL PROTECTED]
 
 Acked-by: Dave Kleikamp [EMAIL PROTECTED]
 
 Andrew,
 Would you like me to add this to the jfs git tree, or would you like to
 handle these patches as a set?
 

My usual algorithm here is to

1: queue all the patches and send the ones which have a maintainer to
   that maintainer until he merges it.

2: If the patches have a dependency upon (say) a VFS patch then I'll
   merge the VFS patch and will then goto 1.

I don't think this particular patch has a VFS dependency so sure, merge
away.  You'll probably see that I merged it anyway, but I'll drop it again
when I see it turn up in your tree (I used to resync with the git trees
at least daily, but I now do this far less frequently because it is such
torture because everyone is paddling in everyone else's puddle).

Re: [RFC] Parallelize IO for e2fsck

2008-01-24 Thread Andreas Dilger

On Jan 24, 2008  18:32 +0100, Bodo Eggert wrote:
 I think a single, system-wide signal is the second-to-worst solution: All
 applications (or the wrong one, if you select one) would free their caches
 and start to crawl, and either stay in this state or slowly increase their
 caches again until they get signaled again. And the signal would either
 come too early or too late. The userspace daemon could collect the weighted
 demand of memory from all applications and tell them how much to use.

Well, sending a few signals (maybe to the top 5 processes in the OOM killer
list) is still a LOT better than OOM-killing them without warning...  That
way important system processes could be taught to understand SIGDANGER and
maybe do something about it instead of being killed, and if Firefox and
other memory hungry processes flush some of their cache it is not fatal.

I wouldn't think that SIGDANGER means "free all of your cache", since the
memory usage clearly wasn't a problem a few seconds previously, so as
an application writer I'd code it as "flush the oldest 10% of my cache"
or similar, and the kernel could send SIGDANGER again (or kill the real
offender) if the memory usage again becomes an issue.
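
A rough sketch of how an application might implement the "flush the oldest
10%" policy described above. Linux has no SIGDANGER, so SIGUSR1 is used
here purely as a stand-in, and all demo_* names are hypothetical. The
handler only sets a flag; the actual trimming is left to the main loop.

```c
#include <signal.h>
#include <stddef.h>

/* Linux has no SIGDANGER; SIGUSR1 is only a stand-in for this sketch. */
#define DEMO_SIGDANGER SIGUSR1

/* Set by the handler, consumed by the main loop. */
static volatile sig_atomic_t demo_pressure;

static void demo_on_danger(int signo)
{
    (void)signo;
    demo_pressure = 1;   /* setting a flag is async-signal-safe */
}

/* Install the handler; returns 0 on success, -1 on failure. */
static int demo_install_handler(void)
{
    return signal(DEMO_SIGDANGER, demo_on_danger) == SIG_ERR ? -1 : 0;
}

/*
 * Called from the application's main loop. If a warning has arrived,
 * clear it and report how many of `entries` cache entries to drop:
 * the oldest 10%, and at least one when the cache is not empty.
 * Returns 0 when no warning is pending.
 */
static size_t demo_entries_to_drop(size_t entries)
{
    size_t n;

    if (!demo_pressure)
        return 0;
    demo_pressure = 0;

    n = entries / 10;
    if (n == 0 && entries > 0)
        n = 1;
    return n;
}
```

If memory stays tight, the sender can simply signal again and another
10% goes, which matches the escalation described in the message.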

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.


Re: [patch 26/26] mount options: fix usbfs

2008-01-24 Thread Greg KH

On Thu, Jan 24, 2008 at 08:34:07PM +0100, Miklos Szeredi wrote:
 From: Miklos Szeredi [EMAIL PROTECTED]
 
 Add a .show_options super operation to usbfs.
 
 Signed-off-by: Miklos Szeredi [EMAIL PROTECTED]

Looks good to me.  Do you want to take this through your tree, as it is
dependent on other changes, or do you want me to take this through the
USB tree?  Whatever is easier for you is fine for me.

thanks,

greg k-h

Re: [patch 19/26] mount options: fix jfs

2008-01-24 Thread Dave Kleikamp


On Thu, 2008-01-24 at 13:57 -0800, Andrew Morton wrote:

 My usual algorithm here is to
 
 1: queue all the patches and send the ones which have a maintainer to
that maintainer until he merges it.
 
 2: If the patches have a dependency upon (say) a VFS patch then I'll
merge the VFS patch and will then goto 1.
 
 I don't think this particular patch has a VFS dependency so sure, merge
 away.  You'll probably see that I merged it anyway, but I'll drop it again
 when I see it turn up in your tree (I used to resync with the git trees
 at least daily, but I now do this far less frequently because it is such
 torture because everyone is paddling in everyone else's puddle).

Merged.  Thanks.

Shaggy
-- 
David Kleikamp
IBM Linux Technology Center


Re: [RFC] Parallelize IO for e2fsck

2008-01-24 Thread Adrian Bunk

On Thu, Jan 24, 2008 at 06:32:15PM +0100, Bodo Eggert wrote:
 Alan Cox [EMAIL PROTECTED] wrote:
 
  I'd tried to advocate SIGDANGER some years ago as well, but none of
  the kernel maintainers were interested.  It definitely makes sense
  to have some sort of mechanism like this.  At the time I first brought
  it up it was in conjunction with Netscape using too much cache on some
  system, but it would be just as useful for all kinds of other memory-
  hungry applications.
  
  There is an early thread for a /proc file which you can add to your
  poll() set and it will wake people when memory is low. Very elegant and
  if async support is added it will also give you the signal variant for
  free.
 
 IMO you'll need a userspace daemon. The kernel does only know about the
 amount of memory available / recommended for a system (or container),
 while the user knows which program's cache is most precious today.
 
 (Of course the userspace daemon will in turn need the /proc file.)
 
 I think a single, system-wide signal is the second-worst solution: All
 applications (or the wrong one, if you select one) would free their caches
 and start to crawl, and either stay in this state or slowly increase their
 caches again until they get signaled again. And the signal would either
 come too early or too late. The userspace daemon could collect the weighted
 demand of memory from all applications and tell them how much to use.

I don't think that's something that would require fine-tuning on a
per-application basis - the kernel should tell all applications once to
reduce memory consumption and write a fat warning to the logs (which
will on well-maintained systems be mailed to the admin).

Your "and tell them how much to use" wouldn't work for most applications
- e.g. I've worked the last weeks with a computer with 512 MB RAM and no
swap, which means usually only 200 MB of free RAM. I've gotten quite
used to git aborting with "fatal: Out of memory, malloc failed" when
200 MB weren't enough for git, and I don't think there is any reasonable
way for git to reduce the memory usage while continuing to run.

In practice, there is a small number of programs that are both the
common memory hogs and should be able to reduce their memory consumption
by 10% or 20% without big problems when requested (e.g. Java VMs,
Firefox and databases come into my mind).

And from a performance point of view letting applications voluntarily 
free some memory is better even than starting to swap.

cu
Adrian

-- 

   Is there not promise of rain? Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   Only a promise, Lao Er said.
   Pearl S. Buck - Dragon Seed


Re: [RFC] Parallelize IO for e2fsck

2008-01-24 Thread Theodore Tso

On Fri, Jan 25, 2008 at 01:08:09AM +0200, Adrian Bunk wrote:
 In practice, there is a small number of programs that are both the
 common memory hogs and should be able to reduce their memory consumption
 by 10% or 20% without big problems when requested (e.g. Java VMs,
 Firefox and databases come into my mind).

I agree, it's only a few processes where this makes sense.  But for
those that do, it would be useful if they could register with the
kernel that they would like to know (just before the system starts
ejecting cached data, just before swapping, etc.) and at what
frequency.  And presumably, if the kernel notices that a process is
responding to such requests with memory actually getting released back
to the system, that process could get rewarded by having the OOM
killer less likely to target that particular thread.

AIX basically did this with SIGDANGER (the signal is ignored by
default), except there wasn't the ability for the process to tell the
kernel at what level of memory pressure before it should start getting
notified, and there was no way for the kernel to tell how bad the
memory pressure actually was.  On the other hand, it was a relatively
simple design.

In practice very few processes would indeed pay attention to
SIGDANGER, so I think you're quite right there.

 And from a performance point of view letting applications voluntarily 
 free some memory is better even than starting to swap.

Absolutely.

- Ted

Re: [patch 01/26] mount options: add documentation

2008-01-24 Thread Erez Zadok

In message [EMAIL PROTECTED], Miklos Szeredi writes:
From: Miklos Szeredi [EMAIL PROTECTED]

This series addresses the problem of showing mount options in
/proc/mounts.

Several filesystems which use mount options have not implemented a
.show_options superblock operation. Several others have implemented
this callback, but have not kept it fully up to date with the parsed
options.
[...]

The following filesystems still need fixing: CIFS, NFS, XFS, Unionfs,
Reiser4. For CIFS, NFS and XFS I wasn't able to understand how some
of the options are used. The last two are not yet in mainline, so I
leave fixing those to their respective maintainers out of pure
laziness.

Table displaying status of all in-kernel filesystems:
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
legend:

none - fs has options, but doesn't define ->show_options()
some - fs defines ->show_options(), but only some options are shown
most - fs defines ->show_options(), and shows most of them
good - fs shows all options
noopt - fs does not have options
patch - a patch will be posted
[...]

in -mm:

reiser4 some
unionfs none

Hi Miklos,

Where did you check for the existence of a ->show_options method for
unionfs? Unionfs does implement ->show_options and supports all of the
mount/remount options. See:

http://git.kernel.org/?p=linux/kernel/git/ezk/unionfs.git;a=blob;f=fs/unionfs/super.c;h=986c980261a5b171147d66ac05bf08423e2fd6b6;hb=HEAD#l963

The unionfs ->remount code supports branch-management options which can
add/del/change a branch, but we don't show those directly in ->show_options;
it makes more sense to show the final (and thus most current) branch
configuration.

Could you update your records please?

BTW, I should be able to use your save_mount_options().

Cheers,
Erez.

Re: [RFC] Parallelize IO for e2fsck

2008-01-24 Thread Zan Lynx


On Thu, 2008-01-24 at 18:40 -0500, Theodore Tso wrote:
 On Fri, Jan 25, 2008 at 01:08:09AM +0200, Adrian Bunk wrote:
  In practice, there is a small number of programs that are both the
  common memory hogs and should be able to reduce their memory consumption
  by 10% or 20% without big problems when requested (e.g. Java VMs,
  Firefox and databases come into my mind).
 
 I agree, it's only a few processes where this makes sense.  But for
 those that do, it would be useful if they could register with the
 kernel that they would like to know (just before the system starts
 ejecting cached data, just before swapping, etc.) and at what
 frequency.  And presumably, if the kernel notices that a process is
 responding to such requests with memory actually getting released back
 to the system, that process could get rewarded by having the OOM
 killer less likely to target that particular thread.

Have y'all been following the /dev/mem_notify patches?
http://article.gmane.org/gmane.linux.kernel/628653
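
A minimal sketch of how an application might wait on such a notification
descriptor with poll(). It assumes, without confirmation from this thread,
that the device becomes readable (POLLIN) when memory is low; the function
name wait_for_low_memory() is hypothetical, and the caller is expected to
open the descriptor itself (for example from /dev/mem_notify if those
patches are applied):

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Wait until fd becomes readable or timeout_ms elapses.
 * Returns 1 if readable (treated here as a low-memory notification,
 * which is an assumption about the device's semantics), 0 on timeout,
 * and -1 on error. A call interrupted by a signal is retried with the
 * full timeout, so the total wait may exceed timeout_ms.
 */
static int wait_for_low_memory(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    assert(fd >= 0);

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc < 0)
        return -1;
    if (rc == 0)
        return 0;
    return (pfd.revents & POLLIN) ? 1 : -1;
}
```

Because the descriptor is an ordinary pollable fd, it can be added to an
existing event loop's poll() set rather than handled in a signal handler.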

-- 
Zan Lynx [EMAIL PROTECTED]



Re: [RFC] ext3: per-process soft-syncing data=ordered mode

2008-01-24 Thread Chris Snook


Al Boldi wrote:

Greetings!

data=ordered mode has proven reliable over the years, and it does this by 
ordering filedata flushes before metadata flushes.  But this sometimes 
causes contention in the order of a 10x slowdown for certain apps, either 
due to the misuse of fsync or due to inherent behaviour like db's, as well 
as inherent starvation issues exposed by the data=ordered mode.


data=writeback mode alleviates data=ordered mode slowdowns, but only works 
per-mount and is too dangerous to run as a default mode.


This RFC proposes to introduce a tunable which allows disabling fsync and 
changing ordered into writeback writeout on a per-process basis like this:


  echo 1 > /proc/`pidof process`/softsync


Your comments are much welcome!


This is basically a kernel workaround for stupid app behavior.  It wouldn't be 
the first time we've provided such an option, but we shouldn't do it without a 
very good justification.  At the very least, we need a test case that 
demonstrates the problem and benchmark results that prove that this approach 
actually fixes it.  I suspect we can find a cleaner fix for the problem.


-- Chris

Re: possible deadlock shown by CONFIG_PROVE_LOCKING

2008-01-24 Thread Lachlan McIlroy


Carlos Carvalho wrote:

I compiled the kernel with Ingo's CONFIG_PROVE_LOCKING and got the
below at boot. Is it a problem?

It was a problem - it has since been fixed in 2.6.23.
Patch is attached in case you're interested.



Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
... MAX_LOCKDEP_SUBCLASSES:8
... MAX_LOCK_DEPTH:  30
... MAX_LOCKDEP_KEYS:2048
... CLASSHASH_SIZE:   1024
... MAX_LOCKDEP_ENTRIES: 8192
... MAX_LOCKDEP_CHAINS:  16384
... CHAINHASH_SIZE:  8192
 memory used by lock dependency info: 1648 kB
 per task-struct memory footprint: 1680 bytes

| Locking API testsuite:

[removed]
---
Good, all 218 testcases passed! |
-

Further down

md: running: sdah1sdag1
raid1: raid set md3 active with 2 out of 2 mirrors
md: ... autorun DONE.
Filesystem md1: Disabling barriers, not supported by the underlying device
XFS mounting filesystem md1
Ending clean XFS mount for filesystem: md1
VFS: Mounted root (xfs filesystem).
Freeing unused kernel memory: 284k freed
Warning: unable to open an initial console.
Filesystem md1: Disabling barriers, not supported by the underlying device

===
[ INFO: possible circular locking dependency detected ]
2.6.22.16 #1
---
mount/1558 is trying to acquire lock:
 ((ip-i_lock)-mr_lock/1){--..}, at: [80312805] xfs_ilock+0x63/0x8d

but task is already holding lock:
 ((ip-i_lock)-mr_lock){}, at: [80312805] xfs_ilock+0x63/0x8d

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

- #1 ((ip-i_lock)-mr_lock){}:
   [80249fa6] __lock_acquire+0xa0f/0xb9f
   [8024a50d] lock_acquire+0x48/0x63
   [80312805] xfs_ilock+0x63/0x8d
   [8023c909] down_write_nested+0x38/0x46
   [80312805] xfs_ilock+0x63/0x8d
   [803132e8] xfs_iget_core+0x3ef/0x705
   [803136a2] xfs_iget+0xa4/0x14e
   [80328364] xfs_trans_iget+0xb4/0x128
   [80316a57] xfs_ialloc+0x9b/0x4b7
   [80249fc9] __lock_acquire+0xa32/0xb9f
   [80328d87] xfs_dir_ialloc+0x84/0x2cd
   [80312805] xfs_ilock+0x63/0x8d
   [8023c909] down_write_nested+0x38/0x46
   [8032e307] xfs_create+0x331/0x65f
   [80308163] xfs_dir2_leaf_lookup+0x1d/0x96
   [80338367] xfs_vn_mknod+0x12f/0x1f2
   [8027fb0a] vfs_create+0x6e/0x9e
   [80282af3] open_namei+0x1f7/0x6a9
   [8021843d] do_page_fault+0x438/0x78f
   [8027705a] do_filp_open+0x1c/0x3d
   [8045bf56] _spin_unlock+0x17/0x20
   [80276e3d] get_unused_fd+0x11c/0x12a
   [802770bb] do_sys_open+0x40/0x7b
   [802095be] system_call+0x7e/0x83
   [] 0x

- #0 ((ip-i_lock)-mr_lock/1){--..}:
   [80248896] print_circular_bug_header+0xcc/0xd3
   [80249ea2] __lock_acquire+0x90b/0xb9f
   [8024a50d] lock_acquire+0x48/0x63
   [80312805] xfs_ilock+0x63/0x8d
   [8023c909] down_write_nested+0x38/0x46
   [80312805] xfs_ilock+0x63/0x8d
   [8032bd30] xfs_lock_inodes+0x152/0x16d
   [8032e807] xfs_link+0x1d2/0x3f7
   [80249f3f] __lock_acquire+0x9a8/0xb9f
   [80337fe5] xfs_vn_link+0x3c/0x91
   [80248f4a] mark_held_locks+0x58/0x72
   [8045a9b7] __mutex_lock_slowpath+0x250/0x266
   [80249119] trace_hardirqs_on+0x115/0x139
   [8045a9c2] __mutex_lock_slowpath+0x25b/0x266
   [8027f88b] vfs_link+0xe8/0x124
   [802822d8] sys_linkat+0xcd/0x129
   [8045baaf] trace_hardirqs_on_thunk+0x35/0x37
   [80249119] trace_hardirqs_on+0x115/0x139
   [8045baaf] trace_hardirqs_on_thunk+0x35/0x37
   [802095be] system_call+0x7e/0x83
   [] 0x

other info that might help us debug this:

3 locks held by mount/1558:
 #0:  (inode-i_mutex/1){--..}, at: [802800f5] lookup_create+0x23/0x85
 #1:  (inode-i_mutex){--..}, at: [8027f878] vfs_link+0xd5/0x124
 #2:  ((ip-i_lock)-mr_lock){}, at: [80312805] xfs_ilock+0x63/0x8d

stack backtrace:

Call Trace:
 [80248612] print_circular_bug_tail+0x69/0x72
 [80248896] print_circular_bug_header+0xcc/0xd3
 [80249ea2] __lock_acquire+0x90b/0xb9f
 [8024a50d] lock_acquire+0x48/0x63
 [80312805] xfs_ilock+0x63/0x8d
 [8023c909] down_write_nested+0x38/0x46
 [80312805] xfs_ilock+0x63/0x8d
 [8032bd30] xfs_lock_inodes+0x152/0x16d
 [8032e807]

Re: [patch 06/26] mount options: fix autofs4

2008-01-24 Thread Ian Kent


On Thu, 2008-01-24 at 20:33 +0100, Miklos Szeredi wrote:
 plain text document attachment (autofs4_opts.patch)
 From: Miklos Szeredi [EMAIL PROTECTED]
 
 Add uid= and gid= options to /proc/mounts for autofs4 filesystems.

Apologies, I did say I would do this but have been quite busy.

 
 Signed-off-by: Miklos Szeredi [EMAIL PROTECTED]
Acked-by: Ian Kent [EMAIL PROTECTED]

I haven't tested this yet but it is fairly straightforward.
I will check it out as soon as I get back to some work that I'm doing on
autofs4 (next few days).

 ---
 
 Index: linux/fs/autofs4/inode.c
 ===
 --- linux.orig/fs/autofs4/inode.c 2008-01-22 15:52:42.0 +0100
 +++ linux/fs/autofs4/inode.c  2008-01-22 23:36:02.0 +0100
 @@ -188,11 +188,16 @@ out_kill_sb:
  static int autofs4_show_options(struct seq_file *m, struct vfsmount *mnt)
  {
  	struct autofs_sb_info *sbi = autofs4_sbi(mnt->mnt_sb);
 +	struct inode *root_inode = mnt->mnt_sb->s_root->d_inode;
  
  	if (!sbi)
  		return 0;
  
  	seq_printf(m, ",fd=%d", sbi->pipefd);
 +	if (root_inode->i_uid != 0)
 +		seq_printf(m, ",uid=%u", root_inode->i_uid);
 +	if (root_inode->i_gid != 0)
 +		seq_printf(m, ",gid=%u", root_inode->i_gid);
  	seq_printf(m, ",pgrp=%d", sbi->oz_pgrp);
  	seq_printf(m, ",timeout=%lu", sbi->exp_timeout/HZ);
  	seq_printf(m, ",minproto=%d", sbi->min_proto);
 
 --
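
The effect of the quoted hunk can be sketched in userspace, as an
illustration only and not as kernel code: uid= and gid= are emitted only
when they differ from the default of 0, so a root-owned mount shows the same
option string as before. The struct mount_opts type and the append() and
show_options() helpers are hypothetical stand-ins for the seq_file API:

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical userspace stand-in for the fields the patch reads. */
struct mount_opts {
    int      pipefd;
    unsigned uid;
    unsigned gid;
    int      pgrp;
};

/* Append formatted text at buf + *used. Returns 0, or -1 if it would not fit. */
static int append(char *buf, size_t size, size_t *used, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(buf + *used, size - *used, fmt, ap);
    va_end(ap);
    if (n < 0 || (size_t)n >= size - *used)
        return -1;
    *used += (size_t)n;
    return 0;
}

/* Build the option string; non-default uid/gid are shown, zero ones are omitted. */
static int show_options(char *buf, size_t size, const struct mount_opts *o)
{
    size_t used = 0;

    assert(buf != NULL && o != NULL);

    if (append(buf, size, &used, ",fd=%d", o->pipefd) < 0)
        return -1;
    if (o->uid != 0 && append(buf, size, &used, ",uid=%u", o->uid) < 0)
        return -1;
    if (o->gid != 0 && append(buf, size, &used, ",gid=%u", o->gid) < 0)
        return -1;
    if (append(buf, size, &used, ",pgrp=%d", o->pgrp) < 0)
        return -1;
    return 0;
}
```

With pipefd 5 and pgrp 1234, a mount whose root inode is owned by uid 1000
and gid 100 would yield ",fd=5,uid=1000,gid=100,pgrp=1234", while a
root-owned one yields ",fd=5,pgrp=1234".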

