Hi,
We've been experiencing applications receiving random SIGBUS errors
and the problem is only reproducible when using AUFS.
Our configuration is a simple NFS readonly root with aufs
used to make '/' writeable. We have reproduced running last
week's aufs (20080505) under a locally modified RHEL5 kernel
(based on 2.6.18-54.1.14.el5).
# wopr2 /root > cat /sys/fs/aufs/config
CONFIG_AUFS=m
CONFIG_AUFS_BRANCH_MAX_127=y
CONFIG_AUFS_SYSAUFS=y
CONFIG_AUFS_PUT_FILP_PATCH=y
CONFIG_AUFS_LHASH_PATCH=y
CONFIG_AUFS_KSIZE_PATCH=y
CONFIG_AUFS_DEBUG=y
CONFIG_AUFS_BR_NFS=y
CONFIG_AUFS_MAGIC_SYSRQ=y
The reproducer runs on an 8-way SMP (4 socket, dual core RevF
Opterons) and is a threaded application. The reproducing app
always fails when faulting on the same address (0x00002aaaaaeef000
in this case), which appears to be in the mapped region for the
/lib64/libm-2.5.so shared library. During tracing, it appears that
all/most of the app's 8 threads fault on this same address at the
same time, so they *all* enter aufs_nopage(). Most of the time
this succeeds, but sometimes aufs_nopage() fails to find the page,
returns NULL, and the app gets SIGBUS.
I don't know aufs very well at all, but I don't see how it prevents
a race when multiple threads enter aufs_nopage() at the same time.
(Is this even possible? I can't tell for sure from my tracing)
We have tried the latest aufs (20080512) as of this morning, and
the problem does not seem to reproduce, but I can't understand why,
since there was no change in aufs_nopage() (though there was a
suspicious un-comment of memory barrier in the unused aufs_fault()).
I have done some amount of tracing using systemtap to try to find
the source of the problem. The trace scripts start tracing when they
see aufs_nopage() called with the failing address. An interesting
point is that the fault of this page doesn not always fail, I've
pasted the trace for the failing case, and a successful case below.
In the trace output below, I print some information about the
arguments passed to the called functions. Here is the explanation,
in particular the decoded file information:
file=<address>[(<type>)<root>:<parent_name>/<name>]
where
address = Address of struct file *
type = file->f_mapping->host->i_sb->s_type->name
root = file->f_mapping->host->i_sb->s_root->d_name.name
name = file->f_dentry->d_name.name
parent_name = file->f_dentry->d_parent->d_name.name
Also, when au_fi() is called, I additionally decode the hidden_file (h_file=)
using something like:
struct au_finfo *finfo = file ? file->private_data : NULL;
if (finfo && finfo->fi_hfile) {
struct file * h_file = finfo->fi_hfile[0 + finfo->fi_bstart].hf_file;
}
as taken from the code in aufs_nopage().
So, here is the trace for the failing case:
0 clomp5_wopr7(6973):->aufs_nopage (vma=0xffff8101ffcd28c8
file=0xffff8101feb044c0(nfs)/:lib64/libm-2.5.so addr=0x00002aaaaaeef000)
46 clomp5_wopr7(6973): ->au_fi
(file=0xffff8101feb04dc0[(aufs)/:lib64/libm-2.5.so],
h_file=0xffff8101feb04dc0[(aufs)/:lib64/libm-2.5.so]
110 clomp5_wopr7(6973): ->filemap_nopage
(file=0xffff8101feb04dc0[(aufs)/:lib64/libm-2.5.so] addr=0x00002aaaaaeef000)
124 clomp5_wopr7(6973): ->find_get_page (offset=36 owner=aufs)
130 clomp5_wopr7(6973): <-find_get_page (0xffff81010375aee8
(flags=0x01f8300000000020))
144 clomp5_wopr7(6973): ->aufs_readpage
153 clomp5_wopr7(6973): <-aufs_readpage rc=0
160 clomp5_wopr7(6973): ->aufs_readpage
164 clomp5_wopr7(6973): <-aufs_readpage rc=0
168 clomp5_wopr7(6973): <-filemap_nopage (rc=0x0000000000000000)
172 clomp5_wopr7(6973): <-aufs_nopage (rc=0x0000000000000000)
6973: handle_mm_fault rc = VM_FAULT_SIGBUS
SIGBUS sent to clomp5_wopr7 (tid=6968) by clomp5_wopr7 (tid=6973) uid:6885
Note that when the vma enters aufs_nopage(), its vm_file seems to be
pointing to the underlying nfs file? Is this because another thread
is racing against aufs_nopage() as well? Note also when filemap_nopage()
calls find_get_page() the mapping owner is still aufs, so no page is
found.
Here is a successful fault for the address of interest -- a different
thread in the same run on the app:
0 clomp5_wopr7(6975):->aufs_nopage (vma=0xffff8101ffcd28c8
file=0xffff8101feb04dc0(aufs)/:lib64/libm-2.5.so addr=0x00002aaaaaeef000)
24 clomp5_wopr7(6975): ->au_fi
(file=0xffff8101feb04dc0[(aufs)/:lib64/libm-2.5.so],
h_file=0xffff8101feb04dc0[(aufs)/:lib64/libm-2.5.so]
68 clomp5_wopr7(6975): ->filemap_nopage
(file=0xffff8101feb044c0[(nfs)/:lib64/libm-2.5.so] addr=0x00002aaaaaeef000)
86 clomp5_wopr7(6975): ->find_get_page (offset=36 owner=nfs)
117 clomp5_wopr7(6975): <-find_get_page (0xffff81000b44d3d8
(flags=0x00c808000000006c))
131 clomp5_wopr7(6975): <-filemap_nopage (rc=0xffff81000b44d3d8)
143 clomp5_wopr7(6975): <-aufs_nopage (rc=0xffff81000b44d3d8)
Now the sb_type is aufs as I'd expect, and the hidden file points to
the NFS file. We call find_get_page() with a mapping owner of nfs
and a page is successfully found.
Can anyone help me understand this problem? If this has been fixed in
today's aufs release, I'd also very much like to understand how! :-)
Sorry for the lengthy mail.
Thanks,
mark
-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/