Re: [PATCH] Ext3 online resizing locking issue

2005-08-31 Thread Stephen C. Tweedie
Hi,

On Wed, 2005-08-31 at 12:35, Glauber de Oliveira Costa wrote:

> At first look, I thought about locking gdt-related data. But on a
> closer look, it seemed to me that we're in fact modifying a little
> more than that in the resize code. But all these modifications seem to
> be somehow related to the ext3 superblock-specific data in
> ext3_sb_info. My first naive approach would be adding a lock to that
> struct.

I took great care when making that code SMP-safe to avoid such locks,
for performance reasons.  See the comments at

 * We need to protect s_groups_count against other CPUs seeing
 * inconsistent state in the superblock.

in fs/ext3/resize.c for the rules.  But basically the way it works is
that we only usually modify data that cannot be in use by other parts of
the kernel --- and that's fairly easy to guarantee, since by definition
extending the fs is something that is touching bits that aren't already
in use.  Only once all the new data is safely installed do we atomically
update the s_groups_count field, which instantly makes the new data
visible.  We enforce this ordering via smp read barriers before reading
s_groups_count and write barriers after modifying it, but we don't
actually have locks as such.
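
The install-then-publish ordering described above can be sketched in userspace C11. This is not the ext3 code: the struct, the function names and the fixed-size array are hypothetical, and C11 release/acquire atomics stand in for the kernel's smp write and read barriers. The sketch also assumes only one writer runs at a time.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define PUB_MAX_GROUPS 8  /* hypothetical capacity, for this sketch only */

struct pub_fs {
    int group_desc[PUB_MAX_GROUPS]; /* stands in for per-group metadata */
    atomic_uint groups_count;       /* stands in for s_groups_count */
};

/* Writer: fill in the new slot first, then publish it by bumping the count.
 * The release store plays the role of the write barrier. */
int pub_add_group(struct pub_fs *fs, int desc)
{
    assert(fs != NULL);
    unsigned int n = atomic_load_explicit(&fs->groups_count,
                                          memory_order_relaxed);
    if (n >= PUB_MAX_GROUPS)
        return -1;
    fs->group_desc[n] = desc;  /* no reader looks at slot n yet */
    atomic_store_explicit(&fs->groups_count, n + 1, memory_order_release);
    return 0;
}

/* Reader: sample the count with acquire ordering, then touch only the
 * slots below it.  No lock is taken. */
long pub_sum_groups(struct pub_fs *fs)
{
    assert(fs != NULL);
    unsigned int n = atomic_load_explicit(&fs->groups_count,
                                          memory_order_acquire);
    long sum = 0;
    for (unsigned int i = 0; i < n; i++)
        sum += fs->group_desc[i];
    return sum;
}
```

A reader that samples the count before the release store simply does not see the new slot; one that samples it afterwards sees the slot fully initialised.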

The only use of locking in the resize is hence the superblock lock,
which is not really there to protect the resize from the rest of the fs
--- the s_groups_count barriers do that.  All the sb lock is needed for
is to prevent two resizes from progressing at the same time; and that
could easily be abstracted into a separate resize lock.
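
As a rough illustration of what a separate resize lock could look like, here is a userspace sketch that uses a C11 atomic flag as a "resize in progress" marker. Everything in it (the struct, the rz_ names, the error values) is hypothetical and only shows the exclusion idea; it makes no claim about how ext3 should implement it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct rz_sb {
    atomic_flag resizing;   /* hypothetical "resize in progress" flag */
    unsigned long blocks;   /* stands in for the block count */
};

/* Try to become the only resizer; returns false if one is already running. */
bool rz_begin(struct rz_sb *sb)
{
    assert(sb != NULL);
    return !atomic_flag_test_and_set_explicit(&sb->resizing,
                                              memory_order_acquire);
}

void rz_end(struct rz_sb *sb)
{
    assert(sb != NULL);
    atomic_flag_clear_explicit(&sb->resizing, memory_order_release);
}

/* Grow the toy filesystem.  Exclusion is taken first, so anything that
 * may block (a journal transaction, in the real code) nests inside it. */
int rz_extend(struct rz_sb *sb, unsigned long new_blocks)
{
    int err = 0;

    if (!rz_begin(sb))
        return -1;          /* another resize is in progress */
    if (new_blocks < sb->blocks)
        err = -2;           /* this sketch only grows */
    else
        sb->blocks = new_blocks;
    rz_end(sb);
    return err;
}
```

The flag only keeps resizers apart from each other; it does nothing to protect readers, which is the job of the s_groups_count ordering.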

Cheers,
 Stephen

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Ext3 online resizing locking issue

2005-08-31 Thread Glauber de Oliveira Costa
> 
> The two different uses of the superblock lock are really quite
> different; I don't see any particular problem with using two different
> locks for the two different things.  Mount and the namespace code are
> not locking the same thing --- the fact that the resize code uses the
> superblock lock is really a historical side-effect of the fact that we
> used to use the same overloaded superblock lock in the ext2/ext3 block
> allocation layers to guard bitmap access.
> 
> 
At first look, I thought about locking gdt-related data. But on a
closer look, it seemed to me that we're in fact modifying a little
more than that in the resize code. But all these modifications seem to
be somehow related to the ext3 superblock-specific data in
ext3_sb_info. My first naive approach would be adding a lock to that
struct.

Besides that, by doing so, we become pretty much independent of VFS
locking decisions when handling ext3 data. Do you think it all makes sense?



-- 
=
Glauber de Oliveira Costa
IBM Linux Technology Center - Brazil
[EMAIL PROTECTED]
=


Re: [PATCH] Ext3 online resizing locking issue

2005-08-30 Thread Stephen C. Tweedie
Hi,

On Thu, 2005-08-25 at 21:43, Glauber de Oliveira Costa wrote:

> Just a question here. With s_lock held by the remount code, we're
> altering the struct super_block, and believing we're safe. We try to
> acquire it inside the resize functions, because we're trying to modify
> this same data. Thus, if we rely on another lock, aren't we probably
> messing up something?

The two different uses of the superblock lock are really quite
different; I don't see any particular problem with using two different
locks for the two different things.  Mount and the namespace code are
not locking the same thing --- the fact that the resize code uses the
superblock lock is really a historical side-effect of the fact that we
used to use the same overloaded superblock lock in the ext2/ext3 block
allocation layers to guard bitmap access.

--Stephen



Re: [PATCH] Ext3 online resizing locking issue

2005-08-25 Thread Glauber de Oliveira Costa
 
> NAK, this is wrong:
> 
> > +   lock_super(sb);
> > err = ext3_group_extend(sb, EXT3_SB(sb)->s_es, n_blocks_count);
> > +   unlock_super(sb);
> 
> This basically reverses the order of locking between lock_super() and
> journal_start() (the latter acts like a lock because it can block on a
> resource if the journal is too full for the new transaction.)  That's
> the opposite order to normal, and will result in a potential deadlock.
> 
Ooops! Missed that. But I agree with the point. 

 
> But the _right_ fix, if you really want to keep that code, is probably
> to move all the resize locking to a separate lock that ranks outside the
> journal_start.  The easy workaround is to drop the superblock lock and
> reacquire it around the journal_start(); it would be pretty easy to make
> that work robustly as far as ext3 is concerned, but I suspect there may
> be VFS-layer problems if we start dropping the superblock lock in the
> middle of the s_ops->remount() call --- Al?
> 

Just a question here. With s_lock held by the remount code, we're
altering the struct super_block, and believing we're safe. We try to
acquire it inside the resize functions, because we're trying to modify
this same data. Thus, if we rely on another lock, aren't we probably
messing up something? (For example, both group_extend and the remount code
potentially modify the s_flags field. If we ioctl and remount at the same
time, each one with a different lock, something could go wrong.) Am I
missing something here?

glauber


Re: [PATCH] Ext3 online resizing locking issue

2005-08-25 Thread Stephen C. Tweedie
Hi,

On Wed, 2005-08-24 at 22:03, Glauber de Oliveira Costa wrote:

> This simple patch provides a fix for a locking issue found in the online
> resizing code. The problem actually happened while trying to resize the
> filesystem through the resize=xxx option in a remount.

NAK, this is wrong:

> + lock_super(sb);
>   err = ext3_group_extend(sb, EXT3_SB(sb)->s_es, n_blocks_count);
> + unlock_super(sb);

This basically reverses the order of locking between lock_super() and
journal_start() (the latter acts like a lock because it can block on a
resource if the journal is too full for the new transaction.)  That's
the opposite order to normal, and will result in a potential deadlock.

> + {Opt_resize, "resize=%u"},
>   {Opt_err, NULL},
> - {Opt_resize, "resize"},

Right, that's disabled for now.  I guess the easy fix here is just to
remove the code entirely, given that we have locking problems with
trying to fix it!
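
For readers wondering why the option counts as disabled: in the unpatched table the "resize" entry sits after the {Opt_err, NULL} terminator, so a matcher that stops at the first NULL pattern never reaches it. The sketch below uses a simplified, hypothetical matcher (tok_match is not the kernel's parser, and the enum and table names are made up) to show the effect of the entry order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { OPT_BARRIER, OPT_RESIZE, OPT_ERR };

struct tok { int token; const char *pattern; };

/* Simplified matcher: compare the literal text before any '%' in the
 * pattern, and stop at the first entry whose pattern is NULL. */
int tok_match(const char *s, const struct tok *table)
{
    const struct tok *p;

    assert(s != NULL && table != NULL);
    for (p = table; p->pattern != NULL; p++) {
        size_t len = strcspn(p->pattern, "%");
        if (strncmp(s, p->pattern, len) == 0 &&
            (p->pattern[len] == '%' || s[len] == '\0'))
            return p->token;
    }
    return p->token;  /* the terminator's token, OPT_ERR here */
}

/* Order before the patch: the resize entry is unreachable. */
const struct tok tok_before[] = {
    {OPT_BARRIER, "barrier=%u"},
    {OPT_ERR, NULL},
    {OPT_RESIZE, "resize"},
};

/* Order after the patch: the resize entry precedes the terminator. */
const struct tok tok_after[] = {
    {OPT_BARRIER, "barrier=%u"},
    {OPT_RESIZE, "resize=%u"},
    {OPT_ERR, NULL},
};
```

With these tables, tok_match("resize=1000", tok_before) falls through to OPT_ERR, while the same string against tok_after yields OPT_RESIZE.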

But the _right_ fix, if you really want to keep that code, is probably
to move all the resize locking to a separate lock that ranks outside the
journal_start.  The easy workaround is to drop the superblock lock and
reacquire it around the journal_start(); it would be pretty easy to make
that work robustly as far as ext3 is concerned, but I suspect there may
be VFS-layer problems if we start dropping the superblock lock in the
middle of the s_ops->remount() call --- Al?
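
The ranking argument can be made concrete with a toy lock-order checker. The ranks below encode the order discussed in this mail (a dedicated resize lock outermost, then journal_start(), then the superblock lock); the checker itself, its names and its fixed depth are hypothetical and exist only to show why taking the superblock lock before journal_start() is an inversion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Lower rank means "taken first" (outermost). */
enum rank { RANK_RESIZE = 1, RANK_JOURNAL = 2, RANK_SUPER = 3 };

#define RANK_MAX_HELD 4

struct rank_ctx {
    int held[RANK_MAX_HELD]; /* ranks currently held, in acquisition order */
    int depth;
};

/* Record an acquisition; refuse it if it would invert the ranking. */
bool rank_acquire(struct rank_ctx *c, enum rank r)
{
    assert(c != NULL);
    if (c->depth > 0 && c->held[c->depth - 1] >= (int)r)
        return false;        /* ordering violation: potential deadlock */
    if (c->depth >= RANK_MAX_HELD)
        return false;
    c->held[c->depth++] = (int)r;
    return true;
}

/* Drop the most recently acquired rank. */
void rank_release(struct rank_ctx *c)
{
    assert(c != NULL);
    if (c->depth > 0)
        c->depth--;
}
```

Acquiring RANK_RESIZE, RANK_JOURNAL and RANK_SUPER in that order is accepted; holding RANK_SUPER and then asking for RANK_JOURNAL, which is what the NAKed hunk amounts to, is rejected.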

--Stephen



[PATCH] Ext3 online resizing locking issue

2005-08-24 Thread Glauber de Oliveira Costa
This simple patch provides a fix for a locking issue found in the online
resizing code. The problem actually happened while trying to resize the
filesystem through the resize=xxx option in a remount.

Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]>


diff -up linux-2.6.13-rc6-orig/fs/ext3/ioctl.c linux/fs/ext3/ioctl.c
--- linux-2.6.13-rc6-orig/fs/ext3/ioctl.c   2005-08-24 17:48:22.0 -0300
+++ linux/fs/ext3/ioctl.c   2005-08-24 15:12:48.0 -0300
@@ -206,7 +206,9 @@ flags_err:
if (get_user(n_blocks_count, (__u32 __user *)arg))
return -EFAULT;
 
+   lock_super(sb);
err = ext3_group_extend(sb, EXT3_SB(sb)->s_es, n_blocks_count);
+   unlock_super(sb);
journal_lock_updates(EXT3_SB(sb)->s_journal);
journal_flush(EXT3_SB(sb)->s_journal);
journal_unlock_updates(EXT3_SB(sb)->s_journal);
Only in linux/fs/ext3: patch-mnt_resize
diff -up linux-2.6.13-rc6-orig/fs/ext3/resize.c linux/fs/ext3/resize.c
--- linux-2.6.13-rc6-orig/fs/ext3/resize.c  2005-08-24 17:48:22.0 -0300
+++ linux/fs/ext3/resize.c  2005-08-24 15:15:28.0 -0300
@@ -884,7 +884,9 @@ exit_put:
 /* Extend the filesystem to the new number of blocks specified.  This entry
  * point is only used to extend the current filesystem to the end of the last
  * existing group.  It can be accessed via ioctl, or by "remount,resize="
- * for emergencies (because it has no dependencies on reserved blocks).
+ * for emergencies (because it has no dependencies on reserved blocks). 
+ * 
+ * It should be called with sb->s_lock held
  *
  * If we _really_ wanted, we could use default values to call ext3_group_add()
  * allow the "remount" trick to work for arbitrary resizing, assuming enough
@@ -959,7 +961,6 @@ int ext3_group_extend(struct super_block
goto exit_put;
}
 
-   lock_super(sb);
if (o_blocks_count != le32_to_cpu(es->s_blocks_count)) {
ext3_warning(sb, __FUNCTION__,
 "multiple resizers run on filesystem!\n");
@@ -978,7 +979,6 @@ int ext3_group_extend(struct super_block
es->s_blocks_count = cpu_to_le32(o_blocks_count + add);
ext3_journal_dirty_metadata(handle, EXT3_SB(sb)->s_sbh);
sb->s_dirt = 1;
-   unlock_super(sb);
ext3_debug("freeing blocks %ld through %ld\n", o_blocks_count,
   o_blocks_count + add);
ext3_free_blocks_sb(handle, sb, o_blocks_count, add, &freed_blocks);
diff -up linux-2.6.13-rc6-orig/fs/ext3/super.c linux/fs/ext3/super.c
--- linux-2.6.13-rc6-orig/fs/ext3/super.c   2005-08-24 17:48:22.0 -0300
+++ linux/fs/ext3/super.c   2005-08-24 15:13:16.0 -0300
@@ -639,8 +639,8 @@ static match_table_t tokens = {
{Opt_quota, "quota"},
{Opt_quota, "usrquota"},
{Opt_barrier, "barrier=%u"},
+   {Opt_resize, "resize=%u"},
{Opt_err, NULL},
-   {Opt_resize, "resize"},
 };
 
 static unsigned long get_sb_block(void **data)