Re: CVS commit: src/sys/ufs/ufs

2009-09-22 Thread Manuel Bouyer
On Sun, Sep 20, 2009 at 08:23:38PM +0300, Antti Kantee wrote:
  In ufs_ihashget(), vget() can return a vnode that has been vclean'ed because
  vget() can sleep. After vget returns, check that vp is still connected with
  ip, and that ip still points to the inode we want. This fix the NULL
  pointer dereference in ufs_fhtovp() I've been seeing on a NFS server.
 
 Um, hold the phone.  The whole point of vget() is to provide race-free
 access to the weak vnode reference held by the file system.  Are you
 saying this does not hold anymore?

It depends on what you mean with race-free. If you mean that the
vnode returned by vget() can't be recygled, I think this is true.
If you mean that vget() can't return a clean vnode then this is false:
vget() can sleep in vn_lock(), and it releases the v_interlock mutex before
sleeping. While sleeping vclean() can VOP_RECLAIM() the vnode, even
if v_usecount is  1.

-- 
Manuel Bouyer bou...@antioche.eu.org
 NetBSD: 26 ans d'experience feront toujours la difference
--


Re: CVS commit: src/sys/ufs/ufs

2009-09-22 Thread Antti Kantee
On Tue Sep 22 2009 at 20:06:40 +0200, Manuel Bouyer wrote:
 On Sun, Sep 20, 2009 at 08:23:38PM +0300, Antti Kantee wrote:
   In ufs_ihashget(), vget() can return a vnode that has been vclean'ed 
   because
   vget() can sleep. After vget returns, check that vp is still connected 
   with
   ip, and that ip still points to the inode we want. This fix the NULL
   pointer dereference in ufs_fhtovp() I've been seeing on a NFS server.
  
  Um, hold the phone.  The whole point of vget() is to provide race-free
  access to the weak vnode reference held by the file system.  Are you
  saying this does not hold anymore?
 
 It depends on what you mean with race-free. If you mean that the
 vnode returned by vget() can't be recygled, I think this is true.
 If you mean that vget() can't return a clean vnode then this is false:
 vget() can sleep in vn_lock(), and it releases the v_interlock mutex before
 sleeping. While sleeping vclean() can VOP_RECLAIM() the vnode, even
 if v_usecount is  1.

What is the practical difference of cleaned and recycled for the
file system driver?

If there is a race in vfs and XLOCK is not used properly, I think that
should be investigated and fixed instead of patching file systems here
and there.


Re: CVS commit: src/sys/ufs/ufs

2009-09-22 Thread Manuel Bouyer
On Tue, Sep 22, 2009 at 09:23:05PM +0300, Antti Kantee wrote:
 On Tue Sep 22 2009 at 20:06:40 +0200, Manuel Bouyer wrote:
  On Sun, Sep 20, 2009 at 08:23:38PM +0300, Antti Kantee wrote:
In ufs_ihashget(), vget() can return a vnode that has been vclean'ed 
because
vget() can sleep. After vget returns, check that vp is still connected 
with
ip, and that ip still points to the inode we want. This fix the NULL
pointer dereference in ufs_fhtovp() I've been seeing on a NFS server.
   
   Um, hold the phone.  The whole point of vget() is to provide race-free
   access to the weak vnode reference held by the file system.  Are you
   saying this does not hold anymore?
  
  It depends on what you mean with race-free. If you mean that the
  vnode returned by vget() can't be recygled, I think this is true.
  If you mean that vget() can't return a clean vnode then this is false:
  vget() can sleep in vn_lock(), and it releases the v_interlock mutex before
  sleeping. While sleeping vclean() can VOP_RECLAIM() the vnode, even
  if v_usecount is  1.
 
 What is the practical difference of cleaned and recycled for the
 file system driver?

From what I understand the vnode is not on the free list yet so it can't
be reused for something else.

 
 If there is a race in vfs and XLOCK is not used properly, I think that
 should be investigated and fixed instead of patching file systems here
 and there.

I don't know how XLOCK is supposed to be used (and I even less know how
to change things without creating deadlocks). As I see it XLOCK is set
while the vnode is being cleaned, not before or after cleaning it, so
XLOCK is not going to avoid this race anyway.

I asked for help on tech-kern about this but didn't get any reply ...

-- 
Manuel Bouyer bou...@antioche.eu.org
 NetBSD: 26 ans d'experience feront toujours la difference
--


Re: CVS commit: src/sys/ufs/ufs

2009-09-22 Thread Antti Kantee
On Tue Sep 22 2009 at 21:04:14 +0200, Manuel Bouyer wrote:
 that's not an issue with the reference count. It's an issue with vclean()
 calling VOP_RECLAIM() even if the refcount is greater than 1, and
 vrelel() calling vclean() even if the refcount is greater than 1,
 when the file has been unlink(2)ed. There's a comment about this in
 vrelel(), look for variable recycle.

Ah, ic, probably would've been easier if I read the PR first ;)

So basically someone can vget the vnode (via fhtovp, since it's gone
from the directory namespace) between VOP_INACTIVE and clean.

Your fix doesn't really fix this problem, since depending on timings
the inode might still be recycled after you check it's valid.

Hmm, ok, so things which might happen:

1: VOP_REMOVE, vnode is locked, vrele is called
2: fhtovp before inode is reclaimed, blocks on vn_lock
1: VOP_INACTIVE releases vnlock, returns signal to recycle vnode
2: gets lock, does check that it's still the same inode
1: reclaims vnode
2: boom

Since vget takes the interlock, a dirty  cheap trick might be to check
that the reference count is still one before trying to clean the vnode in
vrelel(), otherwise punting and letting the next release-to-0 reclaim it?

Blah, I didn't even want to think about this migrane-inducer now.
Maybe people who have recently worked on vnode reclaiming could instead
be the ones to comment?


Re: CVS commit: src/sys/ufs/ufs

2009-09-22 Thread Manuel Bouyer
On Tue, Sep 22, 2009 at 10:42:59PM +0300, Antti Kantee wrote:
 On Tue Sep 22 2009 at 21:04:14 +0200, Manuel Bouyer wrote:
  that's not an issue with the reference count. It's an issue with vclean()
  calling VOP_RECLAIM() even if the refcount is greater than 1, and
  vrelel() calling vclean() even if the refcount is greater than 1,
  when the file has been unlink(2)ed. There's a comment about this in
  vrelel(), look for variable recycle.
 
 Ah, ic, probably would've been easier if I read the PR first ;)
 
 So basically someone can vget the vnode (via fhtovp, since it's gone
 from the directory namespace) between VOP_INACTIVE and clean.
 
 Your fix doesn't really fix this problem, since depending on timings
 the inode might still be recycled after you check it's valid.

I don't think so because at this point the vnode is locked. I think the
race window is between vn_lock() releasing the interlock and getting the
vn_lock. After that the reference count is high enouth to prevent
vrelel() trying to clean it (vtryrele() at the start of vrelel will
make it return, and we hold the interlock at this point).


 
 Hmm, ok, so things which might happen:
 
 1: VOP_REMOVE, vnode is locked, vrele is called
 2: fhtovp before inode is reclaimed, blocks on vn_lock

It would fist block on the interlock at this point, I guess.

 1: VOP_INACTIVE releases vnlock, returns signal to recycle vnode
 2: gets lock, does check that it's still the same inode
 1: reclaims vnode
 2: boom
 
 Since vget takes the interlock, a dirty  cheap trick might be to check
 that the reference count is still one before trying to clean the vnode in
 vrelel(), otherwise punting and letting the next release-to-0 reclaim it?

So basically ignoring what VOP_INACTIVE() says. I think this would need
another flag so that new vget() against this vnode can drop it early.
Otherwise we could have a livelock with the NFS server always getting
references to the vnode, preventing it from being recycled and
blocking other threads waiting on the inode to reuse it.

 
 Blah, I didn't even want to think about this migrane-inducer now.
 Maybe people who have recently worked on vnode reclaiming could instead
 be the ones to comment?

that would be nice :)

-- 
Manuel Bouyer bou...@antioche.eu.org
 NetBSD: 26 ans d'experience feront toujours la difference
--


Re: CVS commit: src/sys/ufs/ufs

2009-09-22 Thread Antti Kantee
On Tue Sep 22 2009 at 21:58:02 +0200, Manuel Bouyer wrote:
 [explanations]

This is starting to sound sensible now.  You obviously have invested
quite a bit of thought in investigating the problem.

  Blah, I didn't even want to think about this migrane-inducer now.
  Maybe people who have recently worked on vnode reclaiming could instead
  be the ones to comment?
 
 that would be nice :)

You don't like my comments? ;)


Re: CVS commit: src/sys/ufs/ufs

2009-09-22 Thread Manuel Bouyer
On Tue, Sep 22, 2009 at 11:26:21PM +0300, Antti Kantee wrote:
 On Tue Sep 22 2009 at 21:58:02 +0200, Manuel Bouyer wrote:
  [explanations]
 
 This is starting to sound sensible now.  You obviously have invested
 quite a bit of thought in investigating the problem.

Yes, it was annoying enough :)

 
   Blah, I didn't even want to think about this migrane-inducer now.
   Maybe people who have recently worked on vnode reclaiming could instead
   be the ones to comment?
  
  that would be nice :)
 
 You don't like my comments? ;)

Ho I do, and you certainly know more than I do in this area ... :)

-- 
Manuel Bouyer bou...@antioche.eu.org
 NetBSD: 26 ans d'experience feront toujours la difference
--