When processes are still accessing afs (e.g., cwd is /afs/...) at the
time a shutdown is occurring things get messy.
The afs shutdown fails to umount /afs (EBUSY) and fails to unload the
module (EBUSY). Then several more umount attempts fail (still EBUSY),
the process gets killed and an oops results for any subsequent umount
/afs. (See attached.)
I've noticed this under Red Hat 7.1 with Red Hat's "stock" 2.4.[367]
kernels and OpenAFS 1.2.1. I have not tried other distributions or
kernels. I don't recall the "oops" behavior happening under these
circumstances with OpenAFS prior to 1.2.1.
Ordinarily the problem happens at shutdown time sufficiently late that
klogd isn't still running so there's nothing to run ksymoops on, so
I've reproduced the problem artificially.
(Make sure a process has cwd of /afs/...)
# /etc/init.d/afs stop
Stopping AFS services.....
inetd.afs: no process killed
umount: /afs: device is busy
libafs-2.4.7-2-i686: Device or resource busy
(kill the process whose cwd was /afs/...)
# umount /afs
(watch oops as it occurs)
When this happens at shutdown a power-cycle is required.
I wish I had the expertise to track this down further myself, but alas
I can't do that without guidance. The problem is about 90%
reproducible as shown above, and decreasing as I continue to exersize
the problem... I don't know why sometimes I don't get an oops.
When doing it by hand, make sure to umount /afs at the end and rmmod
libafs or the box will hang unmounting filesystems when you shutdown.
Rudy
ksymoops 2.4.0 on i686 2.4.7-2. Options used
-V (default)
-k /proc/ksyms (default)
-l /proc/modules (default)
-o /lib/modules/2.4.7-2/ (default)
-m /boot/System.map-2.4.7-2 (default)
Warning: You did not tell me where to find symbol information. I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc. ksymoops -h explains the options.
Warning (compare_maps): ksyms_base symbol __VERSIONED_SYMBOL(ataraid_gendisk) not
found in System.map. Ignoring ksyms_base entry
Warning (compare_maps): ksyms_base symbol __VERSIONED_SYMBOL(ataraid_get_device) not
found in System.map. Ignoring ksyms_base entry
Warning (compare_maps): ksyms_base symbol __VERSIONED_SYMBOL(ataraid_register_disk)
not found in System.map. Ignoring ksyms_base entry
Warning (compare_maps): ksyms_base symbol __VERSIONED_SYMBOL(ataraid_release_device)
not found in System.map. Ignoring ksyms_base entry
Warning (compare_maps): mismatch on symbol partition_name , ksyms_base says c01b7600,
System.map says c0157550. Ignoring ksyms_base entry
Warning (compare_maps): mismatch on symbol usb_devfs_handle , usbcore says e106f5a0,
/lib/modules/2.4.7-2/kernel/drivers/usb/usbcore.o says e106f0c0. Ignoring
/lib/modules/2.4.7-2/kernel/drivers/usb/usbcore.o entry
Oct 8 10:22:38 gti kernel: 3c59x: Donald Becker and others.
www.scyld.com/network/vortex.html
Oct 8 10:41:51 gti kernel: 3c59x: Donald Becker and others.
www.scyld.com/network/vortex.html
Oct 8 15:19:40 gti kernel: ac97_codec: AC97 Audio codec, id: 0x8384:0x7609 (SigmaTel
STAC9721/23)
Oct 8 15:46:19 gti kernel: ac97_codec: AC97 Audio codec, id: 0x8384:0x7609 (SigmaTel
STAC9721/23)
Oct 8 17:55:05 gti kernel: 3c59x: Donald Becker and others.
www.scyld.com/network/vortex.html
Oct 8 17:58:18 gti kernel: 3c59x: Donald Becker and others.
www.scyld.com/network/vortex.html
Oct 8 18:59:03 gti kernel: 3c59x: Donald Becker and others.
www.scyld.com/network/vortex.html
Oct 8 21:31:26 gti kernel: ac97_codec: AC97 Audio codec, id: 0x8384:0x7609 (SigmaTel
STAC9721/23)
Oct 8 23:48:39 gti kernel: ac97_codec: AC97 Audio codec, id: 0x8384:0x7609 (SigmaTel
STAC9721/23)
Oct 9 10:15:22 gti kernel: 3c59x: Donald Becker and others.
www.scyld.com/network/vortex.html
Oct 9 10:41:06 gti kernel: kernel BUG at slab.c:1570!
Oct 9 10:41:06 gti kernel: invalid operand: 0000
Oct 9 10:41:06 gti kernel: CPU: 0
Oct 9 10:41:06 gti kernel: EIP: 0010:[kmem_cache_free+87/704]
Oct 9 10:41:06 gti kernel: EIP: 0010:[<c012dca7>]
Using defaults from ksymoops -t elf32-i386 -a i386
Oct 9 10:41:06 gti kernel: EFLAGS: 00010282
Oct 9 10:41:06 gti kernel: eax: 0000001b ebx: e11c1fe0 ecx: 00000001 edx:
00001f86
Oct 9 10:41:06 gti kernel: esi: 008cb744 edi: e11c1fe0 ebp: 00000001 esp:
dc79feb4
Oct 9 10:41:06 gti kernel: ds: 0018 es: 0018 ss: 0018
Oct 9 10:41:06 gti kernel: Process umount (pid: 1296, stackpage=dc79f000)
Oct 9 10:41:06 gti kernel: Stack: c021b79c 00000622 c1b10134 c01264b5 e11c1fe0
e11c1fe0 e11c1fe0 00000001
Oct 9 10:41:06 gti kernel: c014a2fa c1b8832c e11c1fe0 e11c1fe0 c014b96b
e11c1fe0 dd4d955c 00000246
Oct 9 10:41:06 gti kernel: df5c4394 df5bde04 df1cd3ac df1cd394 c01495c3
e11c1fe0 e11c1fe0 c0141690
Oct 9 10:41:06 gti kernel: Call Trace: [copyrite+16636/36756]
[handle_mm_fault+101/224] [<e11c1fe0>] [<e11c1fe0>] [<e11c1fe0>]
Oct 9 10:41:06 gti kernel: Call Trace: [<c021b79c>] [<c01264b5>] [<e11c1fe0>]
[<e11c1fe0>] [<e11c1fe0>]
Oct 9 10:41:06 gti kernel: [<c014a2fa>] [<e11c1fe0>] [<e11c1fe0>] [<c014b96b>]
[<e11c1fe0>] [<c01495c3>]
Oct 9 10:41:06 gti kernel: [<e11c1fe0>] [<e11c1fe0>] [cached_lookup+16/80]
[<e113fd60>] [<e113fd2c>] [shrink_dcache_parent+22/48]
Oct 9 10:41:06 gti kernel: [<e11c1fe0>] [<e11c1fe0>] [<c0141690>] [<e113fd60>]
[<e113fd2c>] [<c01498c6>]
Oct 9 10:41:06 gti kernel: [<c013caea>] [<c0141678>] [<c013cf7a>] [<c0136f23>]
[<c01275d2>] [<c013cfbc>]
Oct 9 10:41:06 gti kernel: [<c0106f0b>]
Oct 9 10:41:06 gti kernel: Code: 0f 0b 58 5a 8b 15 4c 14 2d c0 eb 0d 90 90 90 90 90
90 90 90
>>EIP; c012dca7 <kmem_cache_free+57/2c0> <=====
Trace; c021b79c <copyrite+40fc/8f94>
Trace; c01264b5 <handle_mm_fault+65/e0>
Trace; e11c1fe0 <END_OF_CODE+73321/????>
Trace; e11c1fe0 <END_OF_CODE+73321/????>
Trace; e11c1fe0 <END_OF_CODE+73321/????>
Trace; c014a2fa <destroy_inode+2a/30>
Trace; e11c1fe0 <END_OF_CODE+73321/????>
Trace; e11c1fe0 <END_OF_CODE+73321/????>
Trace; c014b96b <iput+19b/1b0>
Trace; e11c1fe0 <END_OF_CODE+73321/????>
Trace; c01495c3 <prune_dcache+d3/160>
Trace; e11c1fe0 <END_OF_CODE+73321/????>
Trace; e11c1fe0 <END_OF_CODE+73321/????>
Trace; e11c1fe0 <END_OF_CODE+73321/????>
Trace; e11c1fe0 <END_OF_CODE+73321/????>
Trace; c0141690 <cached_lookup+10/50>
Trace; e113fd60 <[libafs-2.4.7-2-i686]rcsid+40/120>
Trace; e113fd2c <[libafs-2.4.7-2-i686]rcsid+c/120>
Trace; c01498c6 <shrink_dcache_parent+16/30>
Trace; c013caea <kill_super+4a/100>
Trace; c0141678 <path_release+28/30>
Trace; c013cf7a <sys_umount+da/110>
Trace; c0136f23 <filp_close+53/60>
Trace; c01275d2 <sys_munmap+32/50>
Trace; c013cfbc <sys_oldumount+c/10>
Trace; c0106f0b <system_call+33/38>
Code; c012dca7 <kmem_cache_free+57/2c0>
00000000 <_EIP>:
Code; c012dca7 <kmem_cache_free+57/2c0> <=====
0: 0f 0b ud2a <=====
Code; c012dca9 <kmem_cache_free+59/2c0>
2: 58 pop %eax
Code; c012dcaa <kmem_cache_free+5a/2c0>
3: 5a pop %edx
Code; c012dcab <kmem_cache_free+5b/2c0>
4: 8b 15 4c 14 2d c0 mov 0xc02d144c,%edx
Code; c012dcb1 <kmem_cache_free+61/2c0>
a: eb 0d jmp 19 <_EIP+0x19> c012dcc0
<kmem_cache_free+70/2c0>
Code; c012dcb3 <kmem_cache_free+63/2c0>
c: 90 nop
Code; c012dcb4 <kmem_cache_free+64/2c0>
d: 90 nop
Code; c012dcb5 <kmem_cache_free+65/2c0>
e: 90 nop
Code; c012dcb6 <kmem_cache_free+66/2c0>
f: 90 nop
Code; c012dcb7 <kmem_cache_free+67/2c0>
10: 90 nop
Code; c012dcb8 <kmem_cache_free+68/2c0>
11: 90 nop
Code; c012dcb9 <kmem_cache_free+69/2c0>
12: 90 nop
Code; c012dcba <kmem_cache_free+6a/2c0>
13: 90 nop
7 warnings issued. Results may not be reliable.