Just one more quick comment: the patch below is only a test patch, not
the real fix. It causes other problems. For example, processes will
deadlock when aufs_nopage() is called as a result of mlockall(). I
hoped it would illustrate the location of the problem and help
determine a correct fix.

Thanks,
mark

> 
> > 
> > We have tried the latest aufs (20080512) as of this morning, and
> > the problem does not seem to reproduce, but I can't understand why,
> > since there was no change in aufs_nopage() (though a memory barrier
> > was suspiciously uncommented in the unused aufs_fault()).
> 
> Well, the SIGBUS issue did eventually reproduce under the latest aufs
> (20080512). So no magical fixes were in that patch set. ;-)
> 
> As a test, I've applied the following patch to our aufs in order to
> serialize access to aufs_nopage() by taking a write lock on the
> mmap_sem. 600 runs so far on this kernel without triggering the
> issue. (The last test reproduced it before run 300.) However, I'll let
> the testing run overnight.
> 
> I'm not sure if the memory barrier is still needed, especially when
> the call is being serialized, but I left it in for now for symmetry 
> with aufs_fault():
> 
> Index: linux+rh+chaos/fs/aufs/f_op.c
> ===================================================================
> --- linux+rh+chaos.orig/fs/aufs/f_op.c
> +++ linux+rh+chaos/fs/aufs/f_op.c
> @@ -411,6 +411,12 @@ static struct page *aufs_nopage(struct v
>       static DECLARE_WAIT_QUEUE_HEAD(wq);
>       struct au_finfo *finfo;
>  
> +     /* 
> +      *   Need to get write lock as we'll update vma
> +      */
> +     up_read (&vma->vm_mm->mmap_sem);
> +     down_write (&vma->vm_mm->mmap_sem);
> +
>       AuTraceEnter();
>       AuDebugOn(!vma || !vma->vm_file);
>       wait_event(wq, (file = au_robr_safe_file(vma)));
> @@ -425,7 +431,7 @@ static struct page *aufs_nopage(struct v
>       h_file = finfo->fi_hfile[0 + finfo->fi_bstart].hf_file;
>       AuDebugOn(!h_file || !au_test_mmapped(file));
>       vma->vm_file = h_file;
> -     //smp_mb();
> +     smp_mb();
>       page = finfo->fi_h_vm_ops->nopage(vma, addr, type);
>       //file->f_ra = h_file->f_ra; //??
>       au_robr_reset_file(vma, file);
> @@ -444,6 +450,10 @@ static struct page *aufs_nopage(struct v
>               //inode->i_atime = h_file->f_dentry->d_inode->i_atime;
>       }
>       AuTraceErrPtr(page);
> +     /*
> +      *  Reassert read lock
> +      */
> +     downgrade_write (&vma->vm_mm->mmap_sem);
>       return page;
>  }
>  
> 
> 
> -------------------------------------------------------------------------
> This SF.net email is sponsored by: Microsoft 
> Defy all challenges. Microsoft(R) Visual Studio 2008. 
> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
> 