Re: CVS commit: src/sys/ufs/ufs
On Sun, Sep 20, 2009 at 08:23:38PM +0300, Antti Kantee wrote: In ufs_ihashget(), vget() can return a vnode that has been vclean'ed because vget() can sleep. After vget returns, check that vp is still connected with ip, and that ip still points to the inode we want. This fix the NULL pointer dereference in ufs_fhtovp() I've been seeing on a NFS server. Um, hold the phone. The whole point of vget() is to provide race-free access to the weak vnode reference held by the file system. Are you saying this does not hold anymore? It depends on what you mean with race-free. If you mean that the vnode returned by vget() can't be recygled, I think this is true. If you mean that vget() can't return a clean vnode then this is false: vget() can sleep in vn_lock(), and it releases the v_interlock mutex before sleeping. While sleeping vclean() can VOP_RECLAIM() the vnode, even if v_usecount is 1. -- Manuel Bouyer bou...@antioche.eu.org NetBSD: 26 ans d'experience feront toujours la difference --
Re: CVS commit: src/sys/ufs/ufs
On Tue Sep 22 2009 at 20:06:40 +0200, Manuel Bouyer wrote: On Sun, Sep 20, 2009 at 08:23:38PM +0300, Antti Kantee wrote: In ufs_ihashget(), vget() can return a vnode that has been vclean'ed because vget() can sleep. After vget returns, check that vp is still connected with ip, and that ip still points to the inode we want. This fix the NULL pointer dereference in ufs_fhtovp() I've been seeing on a NFS server. Um, hold the phone. The whole point of vget() is to provide race-free access to the weak vnode reference held by the file system. Are you saying this does not hold anymore? It depends on what you mean with race-free. If you mean that the vnode returned by vget() can't be recygled, I think this is true. If you mean that vget() can't return a clean vnode then this is false: vget() can sleep in vn_lock(), and it releases the v_interlock mutex before sleeping. While sleeping vclean() can VOP_RECLAIM() the vnode, even if v_usecount is 1. What is the practical difference of cleaned and recycled for the file system driver? If there is a race in vfs and XLOCK is not used properly, I think that should be investigated and fixed instead of patching file systems here and there.
Re: CVS commit: src/sys/ufs/ufs
On Tue, Sep 22, 2009 at 09:23:05PM +0300, Antti Kantee wrote: On Tue Sep 22 2009 at 20:06:40 +0200, Manuel Bouyer wrote: On Sun, Sep 20, 2009 at 08:23:38PM +0300, Antti Kantee wrote: In ufs_ihashget(), vget() can return a vnode that has been vclean'ed because vget() can sleep. After vget returns, check that vp is still connected with ip, and that ip still points to the inode we want. This fix the NULL pointer dereference in ufs_fhtovp() I've been seeing on a NFS server. Um, hold the phone. The whole point of vget() is to provide race-free access to the weak vnode reference held by the file system. Are you saying this does not hold anymore? It depends on what you mean with race-free. If you mean that the vnode returned by vget() can't be recygled, I think this is true. If you mean that vget() can't return a clean vnode then this is false: vget() can sleep in vn_lock(), and it releases the v_interlock mutex before sleeping. While sleeping vclean() can VOP_RECLAIM() the vnode, even if v_usecount is 1. What is the practical difference of cleaned and recycled for the file system driver? From what I understand the vnode is not on the free list yet so it can't be reused for something else. If there is a race in vfs and XLOCK is not used properly, I think that should be investigated and fixed instead of patching file systems here and there. I don't know how XLOCK is supposed to be used (and I even less know how to change things without creating deadlocks). As I see it XLOCK is set while the vnode is being cleaned, not before or after cleaning it, so XLOCK is not going to avoid this race anyway. I asked for help on tech-kern about this but didn't get any reply ... -- Manuel Bouyer bou...@antioche.eu.org NetBSD: 26 ans d'experience feront toujours la difference --
Re: CVS commit: src/sys/ufs/ufs
On Tue Sep 22 2009 at 21:04:14 +0200, Manuel Bouyer wrote: that's not an issue with the reference count. It's an issue with vclean() calling VOP_RECLAIM() even if the refcount is greater than 1, and vrelel() calling vclean() even if the refcount is greater than 1, when the file has been unlink(2)ed. There's a comment about this in vrelel(), look for variable recycle. Ah, ic, probably would've been easier if I read the PR first ;) So basically someone can vget the vnode (via fhtovp, since it's gone from the directory namespace) between VOP_INACTIVE and clean. Your fix doesn't really fix this problem, since depending on timings the inode might still be recycled after you check it's valid. Hmm, ok, so things which might happen: 1: VOP_REMOVE, vnode is locked, vrele is called 2: fhtovp before inode is reclaimed, blocks on vn_lock 1: VOP_INACTIVE releases vnlock, returns signal to recycle vnode 2: gets lock, does check that it's still the same inode 1: reclaims vnode 2: boom Since vget takes the interlock, a dirty cheap trick might be to check that the reference count is still one before trying to clean the vnode in vrelel(), otherwise punting and letting the next release-to-0 reclaim it? Blah, I didn't even want to think about this migrane-inducer now. Maybe people who have recently worked on vnode reclaiming could instead be the ones to comment?
Re: CVS commit: src/sys/ufs/ufs
On Tue, Sep 22, 2009 at 10:42:59PM +0300, Antti Kantee wrote: On Tue Sep 22 2009 at 21:04:14 +0200, Manuel Bouyer wrote: that's not an issue with the reference count. It's an issue with vclean() calling VOP_RECLAIM() even if the refcount is greater than 1, and vrelel() calling vclean() even if the refcount is greater than 1, when the file has been unlink(2)ed. There's a comment about this in vrelel(), look for variable recycle. Ah, ic, probably would've been easier if I read the PR first ;) So basically someone can vget the vnode (via fhtovp, since it's gone from the directory namespace) between VOP_INACTIVE and clean. Your fix doesn't really fix this problem, since depending on timings the inode might still be recycled after you check it's valid. I don't think so because at this point the vnode is locked. I think the race window is between vn_lock() releasing the interlock and getting the vn_lock. After that the reference count is high enouth to prevent vrelel() trying to clean it (vtryrele() at the start of vrelel will make it return, and we hold the interlock at this point). Hmm, ok, so things which might happen: 1: VOP_REMOVE, vnode is locked, vrele is called 2: fhtovp before inode is reclaimed, blocks on vn_lock It would fist block on the interlock at this point, I guess. 1: VOP_INACTIVE releases vnlock, returns signal to recycle vnode 2: gets lock, does check that it's still the same inode 1: reclaims vnode 2: boom Since vget takes the interlock, a dirty cheap trick might be to check that the reference count is still one before trying to clean the vnode in vrelel(), otherwise punting and letting the next release-to-0 reclaim it? So basically ignoring what VOP_INACTIVE() says. I think this would need another flag so that new vget() against this vnode can drop it early. Otherwise we could have a livelock with the NFS server always getting references to the vnode, preventing it from being recycled and blocking other threads waiting on the inode to reuse it. Blah, I didn't even want to think about this migrane-inducer now. Maybe people who have recently worked on vnode reclaiming could instead be the ones to comment? that would be nice :) -- Manuel Bouyer bou...@antioche.eu.org NetBSD: 26 ans d'experience feront toujours la difference --
Re: CVS commit: src/sys/ufs/ufs
On Tue Sep 22 2009 at 21:58:02 +0200, Manuel Bouyer wrote: [explanations] This is starting to sound sensible now. You obviously have invested quite a bit of thought in investigating the problem. Blah, I didn't even want to think about this migrane-inducer now. Maybe people who have recently worked on vnode reclaiming could instead be the ones to comment? that would be nice :) You don't like my comments? ;)
Re: CVS commit: src/sys/ufs/ufs
On Tue, Sep 22, 2009 at 11:26:21PM +0300, Antti Kantee wrote: On Tue Sep 22 2009 at 21:58:02 +0200, Manuel Bouyer wrote: [explanations] This is starting to sound sensible now. You obviously have invested quite a bit of thought in investigating the problem. Yes, it was annoying enough :) Blah, I didn't even want to think about this migrane-inducer now. Maybe people who have recently worked on vnode reclaiming could instead be the ones to comment? that would be nice :) You don't like my comments? ;) Ho I do, and you certainly know more than I do in this area ... :) -- Manuel Bouyer bou...@antioche.eu.org NetBSD: 26 ans d'experience feront toujours la difference --