I'm having a problem with one of our servers that may or may not be related to AUFS. I have several other machines on our network with the exact same setup that don't have any problems at all. Many processes back up waiting on disk, which sends I/O wait through the roof. We've been down many paths with this one machine trying to troubleshoot its load issues, but after trying our newer aufs2 kernel on it (it used to be booted with aufs1 and 2.6.25), I noticed that every time disk I/O locks up, I get this kernel oops:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000068
IP: [<ffffffff803f27d7>] aufs_flush+0x97/0x130
PGD 3388a2067 PUD 342a28067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/kernel/uevent_seqnum
CPU 2
Pid: 26001, comm: find Not tainted 2.6.29-xeon-aufs2.29-grsec #1 X7DBU
RIP: 0010:[<ffffffff803f27d7>]  [<ffffffff803f27d7>] aufs_flush+0x97/0x130
RSP: 0018:ffff88034d27bed8  EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff8803bfce50c0 RCX: ffff88029c3db828
RDX: ffff8802aaaeb3a0 RSI: 0000000000000000 RDI: ffff88029c3db824
RBP: ffff88034d27bf18 R08: 0000000000000004 R09: 0000000000010000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
R13: ffff880306777780 R14: ffff88029c3f66c0 R15: 0000000000000001
FS:  00007f15e98bc6d0(0000) GS:ffff88042f81c200(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000068 CR3: 00000003c616b000 CR4: 00000000000406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process find (pid: 26001, threadinfo ffff88034d27a000, task ffff880350176710)
Stack:
 ffff880306777780 ffff880347951080 000000004d27bf28 ffff880306777780
 ffff880347951080 ffff880306777780 0000000000000000 000000000059f0b0
 ffff88034d27bf48 ffffffff8028c607 ffff88034d27bf78 ffff880347951080
Call Trace:
 [<ffffffff8028c607>] filp_close+0x37/0x90
 [<ffffffff8028ddd6>] sys_close+0x96/0xe0
 [<ffffffff8020281b>] system_call_fastpath+0x16/0x1b
Code: 62 28 c7 45 d4 00 00 00 00 45 38 e7 7c 36 48 8b 52 20 49 0f be c4 48 c1 e0 04 48 8b 1c 10 48 85 db 0f 84 7e 00 00 00 48 8b 43 20 <48> 8b 40 68 48 85 c0 74 71 48 8b 75 c8 48 89 df ff d0 85 c0 89
RIP  [<ffffffff803f27d7>] aufs_flush+0x97/0x130
 RSP <ffff88034d27bed8>
CR2: 0000000000000068
---[ end trace 8941305b7d492947 ]---

I'm hoping that someone can shed some light on the matter. I'm not sure if that aufs_flush is a cause or an effect.
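
As a sanity check on the trace itself, here is a minimal Python sketch that compares the faulting instruction bytes (the ones starting at the `<48>` marker in the Code: line) with RAX and CR2. It is not part of aufs or any kernel tooling, and the function names are made up for illustration. It only recognizes the one encoding seen here, `48 8b 40 NN`, which is `mov NN(%rax),%rax`. If the check passes, the fault is a load through a NULL pointer at offset 0x68; it says nothing about which aufs structure that pointer belongs to.

```python
import re

# Excerpt of the oops above; only the fields this sketch reads are kept.
OOPS = """\
RAX: 0000000000000000 RBX: ffff8803bfce50c0 RCX: ffff88029c3db828
Code: 62 28 c7 45 d4 00 00 00 00 45 38 e7 7c 36 48 8b 52 20 49 0f be c4 48 c1 e0 04 48 8b 1c 10 48 85 db 0f 84 7e 00 00 00 48 8b 43 20 <48> 8b 40 68 48 85 c0 74 71 48 8b 75 c8 48 89 df ff d0 85 c0 89
CR2: 0000000000000068
"""


def summarize_oops(text):
    """Return (cr2, rax, first four bytes of the faulting instruction)."""
    cr2 = int(re.search(r"CR2: ([0-9a-f]+)", text).group(1), 16)
    rax = int(re.search(r"RAX: ([0-9a-f]+)", text).group(1), 16)
    code = re.search(r"^Code: (.*)$", text, re.M).group(1).split()
    # The kernel wraps the first byte of the faulting instruction in <...>.
    start = next(i for i, b in enumerate(code) if b.startswith("<"))
    insn = [int(b.strip("<>"), 16) for b in code[start:start + 4]]
    return cr2, rax, insn


def is_load_from_rax_disp8(insn):
    """True for 48 8b 40 NN, i.e. mov NN(%rax),%rax with a positive disp8."""
    return insn[:3] == [0x48, 0x8B, 0x40] and insn[3] < 0x80


cr2, rax, insn = summarize_oops(OOPS)
if is_load_from_rax_disp8(insn):
    addr = rax + insn[3]
    print(f"faulting load reads {rax:#x} + {insn[3]:#x} = {addr:#x}; CR2 = {cr2:#x}")
    print("matches CR2" if addr == cr2 else "does not match CR2")
else:
    print("faulting instruction is not the simple load this sketch handles")
```

The instruction just before the marker, `48 8b 43 20`, is `mov 0x20(%rbx),%rax`, so the NULL value was itself loaded from offset 0x20 of whatever RBX points at.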

Pertinent system information (I'll send the entire kernel .config if you think it will help).

/proc/mounts:
rootfs / rootfs rw 0 0
none /mnt/root_base/sys sysfs rw 0 0
none /mnt/root_base/proc proc rw 0 0
udev /mnt/root_base/dev tmpfs rw,size=10240k,mode=755 0 0
10.104.11.252:/vol/boot/netboot/etch64-peon /mnt/root_base nfs ro,vers=3,rsize=32768,wsize=32768,namlen=255,acregmin=300,acregmax=600,acdirmin=300,acdirmax=600,hard,nointr,nolock,proto=tcp,timeo=7,retrans=3,sec=sys,addr=10.104.11.252 0 0
10.104.11.252:/vol/boot/netboot/etch64-peon /mnt/root_base/dev/.static/dev nfs ro,vers=3,rsize=32768,wsize=32768,namlen=255,acregmin=300,acregmax=600,acdirmin=300,acdirmax=600,hard,nointr,nolock,proto=tcp,timeo=7,retrans=3,sec=sys,addr=10.104.11.252 0 0
/dev/sda1 /mnt/local ext3 rw,errors=continue,data=ordered 0 0
none / aufs rw,si=679a245f70b0df51 0 0
tmpfs /lib/init/rw tmpfs rw,nosuid,mode=755 0 0
proc /proc proc rw,nosuid,nodev,noexec 0 0
sysfs /sys sysfs rw,nosuid,nodev,noexec 0 0
none /dev/.static/dev aufs rw,si=679a245f70b0df51 0 0
tmpfs /dev tmpfs rw,size=10240k,mode=755 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev 0 0
devpts /dev/pts devpts rw,nosuid,noexec,gid=5,mode=620 0 0
/dev/sda2 /tmp ext3 rw,noexec,noatime,nodiratime,errors=continue,commit=300,data=ordered 0 0
/dev/sda6 /usr/local/var/spool/cron/crontabs ext3 rw,nosuid,nodev,errors=continue,data=ordered 0 0
/dev/sdb1 /home ext3 rw,nosuid,nodev,noatime,nodiratime,errors=remount-ro,data=writeback 0 0
rpc_pipefs /usr/local/var/lib/nfs/rpc_pipefs rpc_pipefs rw 0 0
binfmt_misc /proc/sys/fs/binfmt_misc binfmt_misc rw 0 0

/sys/module/aufs/parameters/brs = 1
/sys/module/aufs/parameters/nwkq = 4
/sys/fs/aufs/si_679a245f70b0df51/br0 = /mnt/local=rw
/sys/fs/aufs/si_679a245f70b0df51/br1 = /mnt/root_base=ro
/sys/fs/aufs/si_679a245f70b0df51/xi_path = /mnt/local/.aufs.xino

kernel version: 2.6.29-xeon-aufs2.29-grsec
aufs version: aufs 2-29

Aufs .config options:
CONFIG_AUFS_FS=y
CONFIG_AUFS_BRANCH_MAX_127=y
# CONFIG_AUFS_BRANCH_MAX_511 is not set
# CONFIG_AUFS_BRANCH_MAX_1023 is not set
# CONFIG_AUFS_BRANCH_MAX_32767 is not set
CONFIG_AUFS_HINOTIFY=y
# CONFIG_AUFS_EXPORT is not set
# CONFIG_AUFS_BR_RAMFS is not set
# CONFIG_AUFS_DEBUG is not set
CONFIG_AUFS_BDEV_LOOP=y
CONFIG_AUFS_INO_T_64=y