On 10/28/09 7:29 PM, [email protected] wrote:
> Hello michael,
>
> michael rodriguez:
>> I netboot debian linux (etch). I mount the OS filesystem over nfs with
>> aufs. kernel version 2.6.29.6 using the aufs patches. I am seeing
>> situations where processes pile up on aufs read locks. The problem
>> occurs on active web servers with io utilization of about 10-20%. The
>> machine stays responsive and can often recover, but the load spikes into
>> the hundreds.
>>
>> Here is a partial kernel stack trace from when the problem is occurring.
>> Is there an easy way to avoid this bottleneck?
>
> I've checked the stacktraces and translated the addresses into symbols
> with a simple '#define' and cpp(1).
> They show nfs_permission() calling aufs_read_lock(), which is impossible.
> How did you mount aufs (on the nfs client), and how did you mount and
> export (on the nfs server)?
> The stacktrace might be incorrect, depending on your kernel
> configuration.
>
> Finally, please read the aufs README file and provide the information
> listed there. You wrote that your kernel is 2.6.29.6, but ksymoops says
> 2.6.24.4. Which is correct?

2.6.29.6 is correct. I ran ksymoops on a different machine, sorry.
Here is the requested information:


- /proc/mounts (instead of the output of mount(8))

rootfs / rootfs rw 0 0
none /mnt/root_base/sys sysfs rw 0 0
none /mnt/root_base/proc proc rw 0 0
udev /mnt/root_base/dev tmpfs rw,size=10240k,mode=755 0 0
10.106.0.5:/vol/boot/netboot/etch64-peon /mnt/root_base nfs ro,vers=3,rsize=32768,wsize=32768,namlen=255,acregmin=300,acregmax=600,acdirmin=300,acdirmax=600,hard,nointr,nolock,proto=tcp,timeo=7,retrans=3,sec=sys,addr=10.106.0.5 0 0
10.106.0.5:/vol/boot/netboot/etch64-peon /mnt/root_base/dev/.static/dev nfs ro,vers=3,rsize=32768,wsize=32768,namlen=255,acregmin=300,acregmax=600,acdirmin=300,acdirmax=600,hard,nointr,nolock,proto=tcp,timeo=7,retrans=3,sec=sys,addr=10.106.0.5 0 0
/dev/sda1 /mnt/local ext3 rw,errors=continue,data=ordered 0 0
none / aufs rw,si=679a245f70c6e951 0 0
tmpfs /lib/init/rw tmpfs rw,nosuid,mode=755 0 0
proc /proc proc rw,nosuid,nodev,noexec 0 0
sysfs /sys sysfs rw,nosuid,nodev,noexec 0 0
none /dev/.static/dev aufs rw,si=679a245f70c6e951 0 0
tmpfs /dev tmpfs rw,size=10240k,mode=755 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev 0 0
devpts /dev/pts devpts rw,nosuid,noexec,gid=5,mode=620 0 0
/dev/sda2 /tmp ext3 rw,noexec,noatime,nodiratime,errors=continue,commit=300,data=ordered 0 0
/dev/sda6 /usr/local/var/spool/cron/crontabs ext3 rw,nosuid,nodev,errors=continue,data=ordered 0 0
/dev/sdb1 /home ext3 rw,nosuid,nodev,noatime,nodiratime,errors=remount-ro,data=writeback 0 0
rpc_pipefs /usr/local/var/lib/nfs/rpc_pipefs rpc_pipefs rw 0 0
binfmt_misc /proc/sys/fs/binfmt_misc binfmt_misc rw 0 0
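
For readers who want to pick the aufs entries (and their si= ids) out of a listing like the one above, here is a minimal Python sketch. The function name aufs_mounts is hypothetical and not part of aufs; the sample lines are copied from the listing, and the code assumes the usual six whitespace-separated /proc/mounts fields.

```python
def aufs_mounts(mounts_text):
    """Return (mount_point, si_id) for every aufs entry in /proc/mounts text."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fstype, options, dump, pass.
        if len(fields) < 4 or fields[2] != "aufs":
            continue
        si = None
        for opt in fields[3].split(","):
            if opt.startswith("si="):
                si = opt[len("si="):]
        found.append((fields[1], si))
    return found


# Sample lines taken from the listing above.
sample = """none / aufs rw,si=679a245f70c6e951 0 0
/dev/sda1 /mnt/local ext3 rw,errors=continue,data=ordered 0 0
none /dev/.static/dev aufs rw,si=679a245f70c6e951 0 0
"""

for mount_point, si in aufs_mounts(sample):
    print(mount_point, si)
```

On a live system the same function could be fed the contents of /proc/mounts instead of the sample string. The si id it returns corresponds to the si_<id> directory name under /sys/fs/aufs.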

- /sys/module/aufs/*

indians:~# find /sys/module/aufs -name "*" -type f -print -exec cat {} \;
/sys/module/aufs/parameters/brs
1
/sys/module/aufs/parameters/nwkq
4

- /sys/fs/aufs/* (if you have them)

indians:~# find /sys/fs/aufs -name "*" -type f -print -exec cat {} \;
/sys/fs/aufs/si_679a245f70c6e951/xi_path
/mnt/local/.aufs.xino
/sys/fs/aufs/si_679a245f70c6e951/br0
/mnt/local=rw
/sys/fs/aufs/si_679a245f70c6e951/br1
/mnt/root_base=ro
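
The branch layout shown above (br0 is the writable local disk, br1 the read-only nfs root) can also be read programmatically. This is only a sketch: parse_branch and read_branches are hypothetical helper names, and the code assumes the brN files each hold a single "path=permission" line, as in the output above.

```python
import glob
import os
import re


def parse_branch(line):
    """Split an aufs branch entry such as '/mnt/local=rw' into (path, perm)."""
    # Split on the last '=' so a path containing '=' stays intact.
    path, _, perm = line.strip().rpartition("=")
    return path, perm


def read_branches(si_dir):
    """Read br0, br1, ... under one /sys/fs/aufs/si_* directory, in index order."""
    files = [p for p in glob.glob(os.path.join(si_dir, "br*"))
             if re.fullmatch(r"br\d+", os.path.basename(p))]
    # Sort numerically so that br10 comes after br9, not after br1.
    files.sort(key=lambda p: int(os.path.basename(p)[2:]))
    branches = []
    for p in files:
        with open(p) as f:
            branches.append(parse_branch(f.read()))
    return branches
```

For the system in this report, read_branches("/sys/fs/aufs/si_679a245f70c6e951") would be expected to list /mnt/local as rw first and /mnt/root_base as ro second.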

- /debug/aufs/* (if you have them)

Don't have.

- linux kernel version
   if your kernel is not plain, for example modified by distributor,
   the url where i can download its source is necessary too.

vanilla 2.6.29.y.git tag v2.6.29.6, merged with aufs2-2.6 tag aufs2-29, 
patched with

http://blackprecipice.com/dl/grsecurity-2.1.14-2.6.29.6-200908252018.patch

Complete sources with patches applied can be downloaded here:

http://blackprecipice.com/dl/ndn-2.6.29.6-aufs2-grsec-v1.5.tar.bz2

- aufs version which was printed at loading the module or booting the
   system, instead of the date you downloaded.

aufs2-29

- configuration (define/undefine CONFIG_AUFS_xxx)

indians:~# zgrep AUFS /proc/config.gz
CONFIG_AUFS_FS=y
CONFIG_AUFS_BRANCH_MAX_127=y
# CONFIG_AUFS_BRANCH_MAX_511 is not set
# CONFIG_AUFS_BRANCH_MAX_1023 is not set
# CONFIG_AUFS_BRANCH_MAX_32767 is not set
CONFIG_AUFS_HINOTIFY=y
# CONFIG_AUFS_EXPORT is not set
# CONFIG_AUFS_RDU is not set
# CONFIG_AUFS_SHWH is not set
# CONFIG_AUFS_BR_RAMFS is not set
# CONFIG_AUFS_BR_FUSE is not set
# CONFIG_AUFS_DEBUG is not set
CONFIG_AUFS_BDEV_LOOP=y
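
To compare aufs options between machines, the zgrep output above can be turned into a small table. This is a sketch only: aufs_options is a hypothetical name, and it assumes the standard kernel .config syntax of "CONFIG_X=value" and "# CONFIG_X is not set".

```python
import re


def aufs_options(config_text):
    """Map each CONFIG_AUFS_* option to its value, or None if it is not set."""
    opts = {}
    for line in config_text.splitlines():
        m = re.match(r"^(CONFIG_AUFS_\w+)=(.*)$", line)
        if m:
            opts[m.group(1)] = m.group(2)
            continue
        m = re.match(r"^# (CONFIG_AUFS_\w+) is not set$", line)
        if m:
            opts[m.group(1)] = None
    return opts


# A few lines in the same format as the output above.
sample_config = """CONFIG_AUFS_FS=y
CONFIG_AUFS_BRANCH_MAX_127=y
CONFIG_AUFS_HINOTIFY=y
# CONFIG_AUFS_DEBUG is not set
CONFIG_NFS_FS=y
"""

for name, value in sorted(aufs_options(sample_config).items()):
    print(name, value if value is not None else "(not set)")
```

On a kernel that exposes /proc/config.gz, the text could be read with gzip.open("/proc/config.gz", "rt") from the Python standard library.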

- kernel configuration or /proc/config.gz (if you have it)

http://blackprecipice.com/dl/dotconfig

- behaviour which you think to be incorrect

I am trying to understand what would cause so many processes to be stuck 
waiting.

here is the System.map file:

http://blackprecipice.com/dl/System.map

and the trace of some waiting processes:

http://blackprecipice.com/dl/indians.kernel.trace
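
For cross-checking the trace against System.map, a raw address can be mapped to the nearest preceding symbol. The sketch below is not the '#define' and cpp(1) method mentioned in the quoted message; it uses a binary search instead. load_system_map and resolve are hypothetical names, and the addresses in the sample are made up for illustration, not taken from the real System.map.

```python
import bisect


def load_system_map(text):
    """Parse System.map text ('address type name' lines) into sorted lists."""
    entries = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue
        addr, _type, name = parts
        entries.append((int(addr, 16), name))
    entries.sort()
    addrs = [a for a, _ in entries]
    names = [n for _, n in entries]
    return addrs, names


def resolve(addrs, names, addr):
    """Return 'symbol+0xoffset' for the nearest symbol at or below addr."""
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None  # address lies below the first symbol
    return "%s+0x%x" % (names[i], addr - addrs[i])


# Made-up addresses; only the symbol names come from this thread.
SAMPLE_MAP = """ffffffff80200000 T _text
ffffffff802a1000 T nfs_permission
ffffffff803b2000 t aufs_read_lock
"""

addrs, names = load_system_map(SAMPLE_MAP)
print(resolve(addrs, names, 0xffffffff803b2044))
```

Because aufs is built in here (CONFIG_AUFS_FS=y), its symbols should appear in System.map. With aufs built as a module, they would have to come from /proc/kallsyms instead.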


Thanks!
michael

http://blackprecipice.com/dl/
