I'm currently seeing an issue with the following mounts:

/tmp on /tmp type tmpfs (rw)
aufs on /var type aufs (rw,xino=/tmp/var/.aufs.xino,br:/tmp/var=rw:/var=ro)
aufs on /etc type aufs (rw,xino=/tmp/etc/.aufs.xino,br:/tmp/etc=rw:/etc=ro)
/tmp on /tmp type tmpfs (rw)

where / (and therefore /var and /etc) is a read-only NFS mount.

We keep finding processes stuck in the D state when accessing /var/log
(which gets most of the aufs load). This is with a 20080317 unionfs
snapshot against 2.6.24, although it also happened with aufs-20080129.
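
For anyone trying to confirm the same symptom, here is a minimal sketch for
picking out tasks in the D (uninterruptible sleep) state. It assumes a
procps-style ps whose second output column is STAT; the helper name
filter_d_state is made up for illustration and is not part of aufs or procps.

```shell
# Hypothetical helper: reads ps-style rows on stdin and keeps only those
# whose second column (STAT) starts with "D", i.e. uninterruptible sleep.
# The header row is dropped because "STAT" does not start with "D".
filter_d_state() {
    awk '$2 ~ /^D/'
}

# Usage on a live system (assumes procps ps; wchan shows where the task sleeps):
#   ps -eo pid,stat,wchan:32,comm | filter_d_state
```

The wchan column should give a rough hint of where each task is sleeping,
without needing a full stack dump.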

The NFS root is writable by a single particular host.

$ cat /sys/fs/aufs/config 
CONFIG_AUFS=m
CONFIG_AUFS_BRANCH_MAX_127=y
CONFIG_AUFS_SYSAUFS=y
CONFIG_AUFS_RR_SQUASHFS=y
CONFIG_AUFS_SEC_PERM_PATCH=y
CONFIG_AUFS_PUT_FILP_PATCH=y
CONFIG_AUFS_LHASH_PATCH=y
CONFIG_AUFS_KSIZE_PATCH=y

$ cat /sys/fs/aufs/stat 
wkq max_busy: 1 0 0 0, 0(generic)

$ cat /sys/fs/aufs/brs  
aufs /var ffff81046a56f400 br:/tmp/var=rw:/var=ro
aufs /etc ffff81046a56f800 br:/tmp/etc=rw:/etc=ro

$ cat /sys/fs/aufs/ffff81046a56f400/xino 
8x4096 4096
0: 1, 88x4096 197464
1: 1, 1320x4096 1324584

Here is an example trace of a blocked task:

sh            D ffffffff804297c0     0  7560   7559
 ffff8103d01e39d8 0000000000000086 0000000000000000 adacabacacacaaa8
 bcc0beb7b3b2b0af ffff81030ced4800 ffffffff804dc4a0 ffff81030ced4a50
 00000000ffffffff a1a1a4a4a7a7a7a9 0000000000000000 0000000000000000
Call Trace:
 [<ffffffff80415c22>] __down_read+0x82/0x9a
 [<ffffffff881c3ab6>] :aufs:di_read_lock+0x1c/0x5c
 [<ffffffff881c2ff5>] :aufs:aufs_d_revalidate+0x65/0x6d4
 [<ffffffff8813528d>] :sunrpc:put_rpccred+0x34/0xf5
 [<ffffffff8817a489>] :nfs:nfs_permission+0x156/0x163
 [<ffffffff803130c7>] __up_read+0x13/0x8a
 [<ffffffff803130c7>] __up_read+0x13/0x8a
 [<ffffffff803130c7>] __up_read+0x13/0x8a
 [<ffffffff881c8667>] :aufs:aufs_permission+0x2b0/0x2fd
 [<ffffffff80314eb7>] copy_user_generic_string+0x17/0x40
 [<ffffffff802a94ab>] __d_lookup+0xaa/0x10e
 [<ffffffff80297969>] get_unused_fd_flags+0x72/0x120
 [<ffffffff802a0190>] do_lookup+0x157/0x1ae
 [<ffffffff802a1c49>] __link_path_walk+0x35e/0xd4e
 [<ffffffff802a2691>] link_path_walk+0x58/0xe0
 [<ffffffff80314eb7>] copy_user_generic_string+0x17/0x40
 [<ffffffff80297969>] get_unused_fd_flags+0x72/0x120
 [<ffffffff802a2a01>] do_path_lookup+0x1a3/0x21b
 [<ffffffff802a33ff>] __path_lookup_intent_open+0x56/0x96
 [<ffffffff802a35c4>] open_namei+0xb3/0x63c
 [<ffffffff803130c7>] __up_read+0x13/0x8a
 [<ffffffff80297c7e>] do_filp_open+0x1c/0x38
 [<ffffffff80314eb7>] copy_user_generic_string+0x17/0x40
 [<ffffffff80297969>] get_unused_fd_flags+0x72/0x120
 [<ffffffff80297ce0>] do_sys_open+0x46/0xc3
 [<ffffffff8020be2e>] system_call+0x7e/0x83

Once this happens, any further process that tries to access the
particular file or directory joins the list of blocked tasks. However,
other directories and files on the same aufs mount usually work fine.

Any thoughts?

..david

