On Sun, Jan 31, 2010 at 12:35:16AM +0100, Manuel Bouyer wrote:
> Hi,
> while investigating directory corruption on my NFS server I found
> a possible issue with the buffer cache.
>
> The buffer cache keeps a hash of buf_t, the key being (vp, bp->b_lblkno).
> This allows, for a read or write, to find if a buffer for this block
> has already been allocated and if data is already in code (eventually).
> buffers in this hash should also be in one of the vnode's buffer list,
> and the vnode should be on the hold list (so it gets recycled after
> vnodes with no buffer).
>
> Now if vnode_free_list is empty, getcleanvnode() will remove one from
> vnode_hold_list and vclean() it. vclean() will flush and release buffers
> from the vnode lists, but won't remove the buffer from the (vp, bp->b_lblkno)
> hash. This clean vnode will get associated with a new inode and
> if the I/O started on it has the same lblkno, the buffer will be
> found in the hash, eventually providing data from the previous inode,
> or writing to the previous inode. This is very rare, because
> vnode from the hold list are not reused often and there is usually
> less buffers than vnodes in the system (so a buffer is likely recycled
> before a vnode is). Also, VREG vnode usually don't use the buffer cache.
>
> I think vclean() should also take care of removing the vnode from
> the buffer cache's hash. Comments ?
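
For readers following along, the lookup in question can be modeled in
userland roughly as follows. This is a toy sketch, not the vfs_bio.c code:
the struct layouts, the hash function and every toy_ name are assumptions
made for illustration. Only the (vp, lblkno) key and the two conditions
checked by the lookup (matching b_vp, buffer not marked invalid) are taken
from this thread. The sketch deliberately leaves an invalidated buffer on
its hash chain, to model the worst case worried about above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NHASH    64   /* number of hash chains (arbitrary) */
#define TOY_BC_INVAL 0x1  /* stand-in for the kernel's BC_INVAL flag */

struct toy_vnode { int id; };  /* stand-in for struct vnode */

struct toy_buf {
	struct toy_buf   *b_next;    /* hash chain link */
	struct toy_vnode *b_vp;      /* owning vnode, NULL once disassociated */
	int64_t           b_lblkno;  /* logical block number within the vnode */
	int               b_cflags;  /* 0 or TOY_BC_INVAL */
};

static struct toy_buf *toy_hash[TOY_NHASH];

/* Pick a hash chain from the (vp, lblkno) key. */
size_t
toy_hashidx(const struct toy_vnode *vp, int64_t lblkno)
{
	return (size_t)(((uintptr_t)vp >> 4) + (uint64_t)lblkno) % TOY_NHASH;
}

/* Associate bp with (vp, lblkno) and put it at the head of its chain. */
void
toy_insert(struct toy_buf *bp, struct toy_vnode *vp, int64_t lblkno)
{
	size_t i = toy_hashidx(vp, lblkno);

	assert(bp != NULL && vp != NULL);
	bp->b_vp = vp;
	bp->b_lblkno = lblkno;
	bp->b_cflags = 0;
	bp->b_next = toy_hash[i];
	toy_hash[i] = bp;
}

/*
 * Model of invalidation: mark the buffer invalid and drop its vnode
 * reference. The buffer is NOT unlinked from the hash chain here.
 */
void
toy_invalidate(struct toy_buf *bp)
{
	bp->b_cflags |= TOY_BC_INVAL;
	bp->b_vp = NULL;
}

/* Return the valid buffer cached for (vp, lblkno), or NULL if none. */
struct toy_buf *
toy_incore(struct toy_vnode *vp, int64_t lblkno)
{
	struct toy_buf *bp;

	for (bp = toy_hash[toy_hashidx(vp, lblkno)]; bp != NULL; bp = bp->b_next) {
		if (bp->b_lblkno == lblkno && bp->b_vp == vp &&
		    (bp->b_cflags & TOY_BC_INVAL) == 0)
			return bp;
	}
	return NULL;
}
```

In this model, inserting a buffer for (vp, 7), calling toy_invalidate() on
it, and then looking up (vp, 7) again with the same vnode address returns
NULL, even though the stale buffer is still linked on the chain.
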
I missed something in my analysis: vclean() invalidates buffers, which,
among other things, calls brelsel(bp, BC_INVAL | BC_VFLUSH). So the buffer
is marked BC_INVAL, which should cause brelvp() to be called, which sets
bp->b_vp to NULL. So incore() won't return this buffer, because b_vp doesn't
match and because the buffer is marked BC_INVAL.

My corruption issue is somewhere else :(

-- 
Manuel Bouyer <[email protected]>
     NetBSD: 26 years of experience will always make the difference
--
