Re: [RFC, PATCH, RESEND] fs: push rcu_barrier() from deactivate_locked_super() to filesystems
On Sat, Jun 09, 2012 at 12:25:57AM -0700, Andrew Morton wrote:
> And... it seems that I misread what's going on. The individual
> filesystems are doing the rcu freeing of their inodes, so it is
> appropriate that they also call rcu_barrier() prior to running
> kmem_cache_destroy(). Which is what Kirill's patch does. oops.

Ack? ;)

-- 
Kirill A. Shutemov
Re: [RFC, PATCH, RESEND] fs: push rcu_barrier() from deactivate_locked_super() to filesystems
On 09/06/2012 02:28, Andrew Morton wrote:
> On Fri, 8 Jun 2012 16:46:47 -0700 Linus Torvalds torva...@linux-foundation.org wrote:
>> Of course, if you just mean having a VFS wrapper that does
>>
>>     static void vfs_inode_kmem_cache_destroy(struct kmem_cache *cachep)
>>     {
>>         rcu_barrier();
>>         kmem_cache_destroy(cachep);
>>     }
>>
>> then we could do that. Not much better than what Kirill's patch did,
>> but at least we could have that comment in just one single place.
>
> That's conceptually what I meant. But it has the problem that new and
> out-of-tree filesystems might forget to do it. Which is why I suggest
> adding a kmem_cache* argument to unregister_filesystem() for this.
> It's a bit awkward, and the fs can pass in NULL if it knows what it's
> doing. But it's reliable.

The call to rcu_barrier() should be mandatory for the
unload-a-fs-module problem, right? If the fs is compiled in statically,
maybe we could avoid it; but in the end that kind of decision is
per-fs, so this could be a clue that the rcu_barrier() call belongs
inside each fs, not in the VFS.

Marco
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC, PATCH, RESEND] fs: push rcu_barrier() from deactivate_locked_super() to filesystems
On Sat, 09 Jun 2012 09:06:28 +0200 Marco Stornelli marco.storne...@gmail.com wrote:
> On 09/06/2012 02:28, Andrew Morton wrote:
>> On Fri, 8 Jun 2012 16:46:47 -0700 Linus Torvalds torva...@linux-foundation.org wrote:
>>> Of course, if you just mean having a VFS wrapper that does
>>>
>>>     static void vfs_inode_kmem_cache_destroy(struct kmem_cache *cachep)
>>>     {
>>>         rcu_barrier();
>>>         kmem_cache_destroy(cachep);
>>>     }
>>>
>>> then we could do that. Not much better than what Kirill's patch did,
>>> but at least we could have that comment in just one single place.
>>
>> That's conceptually what I meant. But it has the problem that new and
>> out-of-tree filesystems might forget to do it. Which is why I suggest
>> adding a kmem_cache* argument to unregister_filesystem() for this.
>> It's a bit awkward, and the fs can pass in NULL if it knows what it's
>> doing. But it's reliable.
>
> The call to rcu_barrier() should be mandatory for the
> unload-a-fs-module problem, right? If the fs is compiled in
> statically, maybe we could avoid it; but in the end that kind of
> decision is per-fs, so this could be a clue that the rcu_barrier()
> call belongs inside each fs, not in the VFS.

No, this is unrelated to module unloading, and the problem affects
statically linked filesystems also. The requirement is that all inodes
which are pending rcu freeing be flushed (and freed) before their cache
is destroyed in kmem_cache_destroy().

And... it seems that I misread what's going on. The individual
filesystems are doing the rcu freeing of their inodes, so it is
appropriate that they also call rcu_barrier() prior to running
kmem_cache_destroy(). Which is what Kirill's patch does. oops.
Re: [RFC, PATCH, RESEND] fs: push rcu_barrier() from deactivate_locked_super() to filesystems
On Sat, 9 Jun 2012 00:41:03 +0300 Kirill A. Shutemov kirill.shute...@linux.intel.com wrote:
> There's no reason to call rcu_barrier() on every
> deactivate_locked_super(). We only need to make sure that all delayed
> rcu free inodes are flushed before we destroy the related cache.
>
> Removing rcu_barrier() from deactivate_locked_super() affects some
> fast paths. E.g. on my machine exit_group() of a last process in IPC
> namespace takes 0.07538s. rcu_barrier() takes 0.05188s of that time.

What an unpleasant patch. Is final-process-exiting-ipc-namespace a
sufficiently high-frequency operation to justify the change?

I don't really understand what's going on here. Are you saying that
there is some filesystem against which we run deactivate_locked_super()
during exit_group(), and that this filesystem doesn't use rcu-freeing
of inodes? The description needs this level of detail, please.

The implementation would be less unpleasant if we could do the
rcu_barrier() in kmem_cache_destroy(). I can't see a way of doing that
without adding a dedicated slab flag, which would require editing all
the filesystems anyway.

(kmem_cache_destroy() already has an rcu_barrier(). Can we do away with
the private rcu games in the vfs and switch to SLAB_DESTROY_BY_RCU?)
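[For readers joining the thread here: the ordering constraint under
discussion comes from the way filesystems defer inode freeing through
call_rcu(). A kernel-style sketch of the ext4-like pattern — not
standalone-compilable, and the "foo" identifiers are placeholders for a
real filesystem's names:]

    /* ->destroy_inode() does not free the object directly; it defers
     * the free until after an RCU grace period has elapsed. */
    static void foo_i_callback(struct rcu_head *head)
    {
            struct inode *inode = container_of(head, struct inode, i_rcu);

            kmem_cache_free(foo_inode_cachep, inode);
    }

    static void foo_destroy_inode(struct inode *inode)
    {
            call_rcu(&inode->i_rcu, foo_i_callback);
    }

    /* Consequence: kmem_cache_destroy(foo_inode_cachep) is only safe
     * once every pending foo_i_callback() has run -- which is exactly
     * what rcu_barrier() waits for.  Destroying the cache with
     * callbacks still queued would free objects into a dead cache. */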
Re: [RFC, PATCH, RESEND] fs: push rcu_barrier() from deactivate_locked_super() to filesystems
On Fri, Jun 08, 2012 at 03:02:53PM -0700, Andrew Morton wrote:
> On Sat, 9 Jun 2012 00:41:03 +0300 Kirill A. Shutemov kirill.shute...@linux.intel.com wrote:
>> There's no reason to call rcu_barrier() on every
>> deactivate_locked_super(). We only need to make sure that all delayed
>> rcu free inodes are flushed before we destroy the related cache.
>>
>> Removing rcu_barrier() from deactivate_locked_super() affects some
>> fast paths. E.g. on my machine exit_group() of a last process in IPC
>> namespace takes 0.07538s. rcu_barrier() takes 0.05188s of that time.
>
> What an unpleasant patch. Is final-process-exiting-ipc-namespace a
> sufficiently high-frequency operation to justify the change?
>
> I don't really understand what's going on here. Are you saying that
> there is some filesystem against which we run
> deactivate_locked_super() during exit_group(), and that this
> filesystem doesn't use rcu-freeing of inodes? The description needs
> this level of detail, please.

I think the rcu_barrier() is in the wrong place. We need it to safely
destroy the inode cache. deactivate_locked_super() is part of the
umount() path, but all filesystems I've checked have an inode cache for
the whole filesystem, not per-mount.

> The implementation would be less unpleasant if we could do the
> rcu_barrier() in kmem_cache_destroy(). I can't see a way of doing that
> without adding a dedicated slab flag, which would require editing all
> the filesystems anyway.

I think rcu_barrier() for all kmem_cache_destroy() calls would be too
expensive.

> (kmem_cache_destroy() already has an rcu_barrier(). Can we do away
> with the private rcu games in the vfs and switch to
> SLAB_DESTROY_BY_RCU?)

-- 
Kirill A. Shutemov
Re: [RFC, PATCH, RESEND] fs: push rcu_barrier() from deactivate_locked_super() to filesystems
On Sat, Jun 09, 2012 at 01:14:46AM +0300, Kirill A. Shutemov wrote:
>> The implementation would be less unpleasant if we could do the
>> rcu_barrier() in kmem_cache_destroy(). I can't see a way of doing
>> that without adding a dedicated slab flag, which would require
>> editing all the filesystems anyway.
>
> I think rcu_barrier() for all kmem_cache_destroy() calls would be too
> expensive.

You've got to be kidding. Please, show us the codepath that would be
hot enough to make that too expensive and would contain
kmem_cache_destroy(). Note that module unload is *not* a hot path - not
on any even remotely sane use.
Re: [RFC, PATCH, RESEND] fs: push rcu_barrier() from deactivate_locked_super() to filesystems
On Sat, 9 Jun 2012 01:14:46 +0300 Kirill A. Shutemov kirill.shute...@linux.intel.com wrote:
> On Fri, Jun 08, 2012 at 03:02:53PM -0700, Andrew Morton wrote:
>> On Sat, 9 Jun 2012 00:41:03 +0300 Kirill A. Shutemov kirill.shute...@linux.intel.com wrote:
>>> There's no reason to call rcu_barrier() on every
>>> deactivate_locked_super(). We only need to make sure that all
>>> delayed rcu free inodes are flushed before we destroy the related
>>> cache.
>>>
>>> Removing rcu_barrier() from deactivate_locked_super() affects some
>>> fast paths. E.g. on my machine exit_group() of a last process in
>>> IPC namespace takes 0.07538s. rcu_barrier() takes 0.05188s of that
>>> time.
>>
>> What an unpleasant patch. Is final-process-exiting-ipc-namespace a
>> sufficiently high-frequency operation to justify the change?

This, please.

>> I don't really understand what's going on here. Are you saying that
>> there is some filesystem against which we run
>> deactivate_locked_super() during exit_group(), and that this
>> filesystem doesn't use rcu-freeing of inodes? The description needs
>> this level of detail, please.

You still haven't explained where this deactivate_locked_super() call
is coming from. Oh well.

> I think the rcu_barrier() is in the wrong place. We need it to safely
> destroy the inode cache. deactivate_locked_super() is part of the
> umount() path, but all filesystems I've checked have an inode cache
> for the whole filesystem, not per-mount.

Well, from a design perspective, putting the rcu_barrier() in the vfs
is the *correct* place. Individual filesystems shouldn't be hard-coding
knowledge about vfs internal machinery.

A neater implementation might be to add a kmem_cache* argument to
unregister_filesystem(). If that is non-NULL, unregister_filesystem()
does the rcu_barrier() and destroys the cache. That way we get to
delete (rather than add) a bunch of code from all filesystems, and new
and out-of-tree filesystems cannot forget to perform the rcu_barrier().

>> The implementation would be less unpleasant if we could do the
>> rcu_barrier() in kmem_cache_destroy(). I can't see a way of doing
>> that without adding a dedicated slab flag, which would require
>> editing all the filesystems anyway.
>
> I think rcu_barrier() for all kmem_cache_destroy() calls would be too
> expensive.

That is not what I proposed.
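[For illustration, Andrew's suggestion could look something like the
helper below. This is a hypothetical sketch, not the real API — the
actual unregister_filesystem() takes only a struct file_system_type *,
and the extra cache argument is exactly what is being proposed:]

    /* Hypothetical variant of unregister_filesystem() that also tears
     * down the filesystem's inode cache.  Passing NULL for
     * inode_cachep means "the fs manages its caches itself". */
    int unregister_filesystem_and_cache(struct file_system_type *fs,
                                        struct kmem_cache *inode_cachep)
    {
            int err = unregister_filesystem(fs);

            if (!err && inode_cachep) {
                    /* wait for every pending call_rcu() inode free... */
                    rcu_barrier();
                    /* ...then it is safe to drop the cache */
                    kmem_cache_destroy(inode_cachep);
            }
            return err;
    }

[Al's objection below is that this only covers one cache per
filesystem, while e.g. btrfs and gfs2 have several rcu-freed caches.]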
Re: [RFC, PATCH, RESEND] fs: push rcu_barrier() from deactivate_locked_super() to filesystems
On Fri, Jun 08, 2012 at 03:25:50PM -0700, Andrew Morton wrote:
> A neater implementation might be to add a kmem_cache* argument to
> unregister_filesystem(). If that is non-NULL, unregister_filesystem()
> does the rcu_barrier() and destroys the cache. That way we get to
> delete (rather than add) a bunch of code from all filesystems, and
> new and out-of-tree filesystems cannot forget to perform the
> rcu_barrier().

There's often enough more than one cache, so that one is a no-go.
Re: [RFC, PATCH, RESEND] fs: push rcu_barrier() from deactivate_locked_super() to filesystems
On Fri, Jun 8, 2012 at 3:23 PM, Al Viro v...@zeniv.linux.org.uk wrote:
> Note that module unload is *not* a hot path - not on any even remotely
> sane use.

Actually, I think we've had distributions that basically did a "load
pretty much everything, and let God sort it out" approach to modules.

I know some people *have* actually worried about module load/unload
performance. Whether it is remotely sane I'm not going to argue for,
but ..

Linus
Re: [RFC, PATCH, RESEND] fs: push rcu_barrier() from deactivate_locked_super() to filesystems
On Fri, Jun 08, 2012 at 03:25:50PM -0700, Andrew Morton wrote:
> On Sat, 9 Jun 2012 01:14:46 +0300 Kirill A. Shutemov kirill.shute...@linux.intel.com wrote:
>> On Fri, Jun 08, 2012 at 03:02:53PM -0700, Andrew Morton wrote:
>>> On Sat, 9 Jun 2012 00:41:03 +0300 Kirill A. Shutemov kirill.shute...@linux.intel.com wrote:
>>>> There's no reason to call rcu_barrier() on every
>>>> deactivate_locked_super(). We only need to make sure that all
>>>> delayed rcu free inodes are flushed before we destroy the related
>>>> cache.
>>>>
>>>> Removing rcu_barrier() from deactivate_locked_super() affects some
>>>> fast paths. E.g. on my machine exit_group() of a last process in
>>>> IPC namespace takes 0.07538s. rcu_barrier() takes 0.05188s of that
>>>> time.
>>>
>>> What an unpleasant patch. Is final-process-exiting-ipc-namespace a
>>> sufficiently high-frequency operation to justify the change?
>
> This, please.

The ALT Linux guys use namespaces (including the IPC namespace) to
create a sandbox[1] for their build system and other use cases. The
build system calls the sandboxing wrapper frequently while setting up
the build chroot and preparing the build. This kind of delay affects
timings significantly.

[1] http://git.altlinux.org/people/ldv/packages/hasher-priv.git

>>> I don't really understand what's going on here. Are you saying that
>>> there is some filesystem against which we run
>>> deactivate_locked_super() during exit_group(), and that this
>>> filesystem doesn't use rcu-freeing of inodes? The description needs
>>> this level of detail, please.
>
> You still haven't explained where this deactivate_locked_super() call
> is coming from. Oh well.

Call Trace:
 [81443a2a] schedule+0x3a/0x50
 [81441e7d] schedule_timeout+0x1cd/0x2c0
 [811f8f87] ? mqueue_destroy_inode+0x17/0x20
 [81443044] wait_for_common+0xc4/0x160
 [8107af50] ? try_to_wake_up+0x2a0/0x2a0
 [810d63b0] ? call_rcu_sched+0x10/0x20
 [810d63a0] ? call_rcu_bh+0x20/0x20
 [81443188] wait_for_completion+0x18/0x20
 [810d5a9b] _rcu_barrier.clone.31+0x9b/0xb0
 [810d5ac0] rcu_barrier_sched+0x10/0x20
 [810d5ad9] rcu_barrier+0x9/0x10
 [811602c9] deactivate_locked_super+0x49/0x90
 [81160d35] deactivate_super+0x45/0x60
 [8117ad74] mntput_no_expire+0x104/0x150
 [8117addc] mntput+0x1c/0x30
 [8117cda7] kern_unmount+0x27/0x30
 [811faeb0] mq_put_mnt+0x10/0x20
 [811fb57f] put_ipc_ns+0x3f/0xb0
 [81071f5c] free_nsproxy+0x3c/0xa0
 [81072143] switch_task_namespaces+0x33/0x40
 [8107215b] exit_task_namespaces+0xb/0x10
 [8104f154] do_exit+0x4b4/0x8a0
 [8104f7e3] do_group_exit+0x53/0xc0
 [8104f862] sys_exit_group+0x12/0x20
 [8144c939] system_call_fastpath+0x16/0x1b

-- 
Kirill A. Shutemov
Re: [RFC, PATCH, RESEND] fs: push rcu_barrier() from deactivate_locked_super() to filesystems
On Sat, 9 Jun 2012 02:31:27 +0300 Kirill A. Shutemov kirill.shute...@linux.intel.com wrote:
> On Fri, Jun 08, 2012 at 03:31:20PM -0700, Andrew Morton wrote:
>> On Fri, 8 Jun 2012 23:27:34 +0100 Al Viro v...@zeniv.linux.org.uk wrote:
>>> On Fri, Jun 08, 2012 at 03:25:50PM -0700, Andrew Morton wrote:
>>>> A neater implementation might be to add a kmem_cache* argument to
>>>> unregister_filesystem(). If that is non-NULL,
>>>> unregister_filesystem() does the rcu_barrier() and destroys the
>>>> cache. That way we get to delete (rather than add) a bunch of code
>>>> from all filesystems, and new and out-of-tree filesystems cannot
>>>> forget to perform the rcu_barrier().
>>>
>>> There's often enough more than one cache, so that one is a no-go.
>>
>> kmem_cache** ;) Which filesystems have multiple inode caches?
>
> Multiple inode caches? No. Multiple caches with call_rcu() free? See
> btrfs or gfs2.

OK. But for those non-inode caches, the rcu treatment is private to the
filesystem. Hence it is appropriate that the filesystem call
rcu_barrier() for those caches. But in the case of the inode caches,
the rcu treatment is a vfs thing, so it is the vfs which should perform
the rcu_barrier(). This is a red herring - those non-inode caches have
nothing to do with the issue we're discussing.

So how about open-coding the rcu_barrier() in btrfs and gfs2 for the
non-inode caches (which is the appropriate place), and handing the
inode cache over to the vfs for treatment (which is the appropriate
place)?

The downside is that btrfs and gfs2 will do an extra rcu_barrier() at
umount time. Shrug. If they really want to super-optimise that, they
can skip the private rcu_barrier() call and assume that the vfs will be
doing it. Not a good idea, IMO.
Re: [RFC, PATCH, RESEND] fs: push rcu_barrier() from deactivate_locked_super() to filesystems
On Fri, Jun 8, 2012 at 4:37 PM, Andrew Morton a...@linux-foundation.org wrote:
> So how about open-coding the rcu_barrier() in btrfs and gfs2 for the
> non-inode caches (which is the appropriate place), and handing the
> inode cache over to the vfs for treatment (which is the appropriate
> place)?

The thing is, none of the inode caches are really up to the VFS. They
are per-filesystem caches that just *embed* an inode as part of the
bigger ext4_inode or whatever.

But apart from the fact that the embedded inode requires them to then
use the proper call_rcu() stuff to do the delayed free, they really are
pretty much filesystem data structures. The VFS layer can neither free
them nor allocate them, since the VFS layer doesn't even know how big
the structures are, or where the inodes are embedded, or how to
initialize them (or even when to allocate them).

Of course, if you just mean having a VFS wrapper that does

    static void vfs_inode_kmem_cache_destroy(struct kmem_cache *cachep)
    {
        rcu_barrier();
        kmem_cache_destroy(cachep);
    }

then we could do that. Not much better than what Kirill's patch did,
but at least we could have that comment in just one single place.

Linus
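[For illustration, with a wrapper like that a filesystem's module-exit
path could shrink to something like the sketch below. Kernel-style
pseudocode: the "foo" names are placeholders, and
vfs_inode_kmem_cache_destroy() is the wrapper proposed in the mail
above, not an existing API:]

    static void __exit exit_foo_fs(void)
    {
            unregister_filesystem(&foo_fs_type);

            /* Does rcu_barrier() + kmem_cache_destroy(), with the
             * explanatory comment living in one place in the vfs
             * instead of being copy-pasted into every filesystem. */
            vfs_inode_kmem_cache_destroy(foo_inode_cachep);
    }

[The ordering matters either way: the barrier must come after the last
inode can be queued for rcu freeing and before the cache goes away.]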
Re: [RFC, PATCH, RESEND] fs: push rcu_barrier() from deactivate_locked_super() to filesystems
On Fri, 8 Jun 2012 16:46:47 -0700 Linus Torvalds torva...@linux-foundation.org wrote:
> Of course, if you just mean having a VFS wrapper that does
>
>     static void vfs_inode_kmem_cache_destroy(struct kmem_cache *cachep)
>     {
>         rcu_barrier();
>         kmem_cache_destroy(cachep);
>     }
>
> then we could do that. Not much better than what Kirill's patch did,
> but at least we could have that comment in just one single place.

That's conceptually what I meant. But it has the problem that new and
out-of-tree filesystems might forget to do it. Which is why I suggest
adding a kmem_cache* argument to unregister_filesystem() for this.
It's a bit awkward, and the fs can pass in NULL if it knows what it's
doing. But it's reliable.