Just one more quick comment. The patch below is only a test patch and isn't the real fix. It causes other problems: for example, when aufs_nopage() is called as a result of mlockall(), processes will deadlock. Still, I hoped it would illustrate the location of the problem and help determine a correct fix.
Thanks,
mark

>
> >
> > We have tried the latest aufs (20080512) as of this morning, and
> > the problem does not seem to reproduce, but I can't understand why,
> > since there was no change in aufs_nopage() (though there was a
> > suspicious un-comment of memory barrier in the unused aufs_fault()).
>
> Well, the SIGBUS issue did eventually reproduce under the latest aufs
> (20080512). So no magical fixes were in that patch set. ;-)
>
> As a test, I've applied the following patch to our aufs in order to
> serialize access to aufs_nopage() by taking a write lock on the
> mmap_sem. 600 runs so far on this kernel without triggering the
> issue. (Last test reproduced before run 300). However, I'll let the
> testing run overnight.
>
> I'm not sure if the memory barrier is still needed, especially when
> the call is being serialized, but I left it in for now for symmetry
> with aufs_fault():
>
> Index: linux+rh+chaos/fs/aufs/f_op.c
> ===================================================================
> --- linux+rh+chaos.orig/fs/aufs/f_op.c
> +++ linux+rh+chaos/fs/aufs/f_op.c
> @@ -411,6 +411,12 @@ static struct page *aufs_nopage(struct v
>  	static DECLARE_WAIT_QUEUE_HEAD(wq);
>  	struct au_finfo *finfo;
>
> +	/*
> +	 * Need to get write lock as we'll update vma
> +	 */
> +	up_read (&vma->vm_mm->mmap_sem);
> +	down_write (&vma->vm_mm->mmap_sem);
> +
>  	AuTraceEnter();
>  	AuDebugOn(!vma || !vma->vm_file);
>  	wait_event(wq, (file = au_robr_safe_file(vma)));
> @@ -425,7 +431,7 @@ static struct page *aufs_nopage(struct v
>  	h_file = finfo->fi_hfile[0 + finfo->fi_bstart].hf_file;
>  	AuDebugOn(!h_file || !au_test_mmapped(file));
>  	vma->vm_file = h_file;
> -	//smp_mb();
> +	smp_mb();
>  	page = finfo->fi_h_vm_ops->nopage(vma, addr, type);
>  	//file->f_ra = h_file->f_ra; //??
>  	au_robr_reset_file(vma, file);
> @@ -444,6 +450,10 @@ static struct page *aufs_nopage(struct v
>  		//inode->i_atime = h_file->f_dentry->d_inode->i_atime;
>  	}
>  	AuTraceErrPtr(page);
> +	/*
> +	 * Reassert read lock
> +	 */
> +	downgrade_write (&vma->vm_mm->mmap_sem);
>  	return page;
>  }
