Greetings,
I have been running kdb v4.4-2.6.9 on a 2.6.9 kernel for two weeks
and have found the following problems:
1. When the kernel panics, the bt command does not show the
correct stack trace or the arguments along the call chain.
I did not have this problem with kdb-v4.3-2.4.21 on a
2.4.21 kernel. I patched the kernel with kdb and the patch
applied without failures.
For example, when the kernel panicked, the console showed the
following output:
inode hda7:2555908 at cdbed370: mode 40755, nlink 0, next 0
Assertion failure in ext3_put_super() at fs/ext3/super.c:418:
"list_empty(&sbi->s_orphan)"
------------[ cut here ]------------
kernel BUG at fs/ext3/super.c:418!
invalid operand: 0000 [#1]
Modules linked in: kdbm_task kdbm_pg kdbm_vm kdbm_x86 mfs nfs lockd md5 ipv6
parport_pc lp parport autofs4 i2c_dev i2c_core sunrpc dm_mirror dm_mod button
battery ac ohci_hcd ehci_hcd snd_intel8x0 snd_ac97_codec snd_pcm_oss
snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_mpu401_uart snd_rawmidi
snd_seq_device snd soundcore sis900 floppy ext3 jbd
CPU: 0
EIP: 0060:[<de82a786>] Tainted: P VLI
EFLAGS: 00010292 (2.6.9-kdb)
EIP is at ext3_put_super+0x166/0x190 [ext3]
eax: 0000005e ebx: ddf74b68 ecx: c03d78c0 edx: 00000000
esi: ddf74ab4 edi: dbb6fad4 ebp: cebbaedc esp: cebbaebc
ds: 007b es: 007b ss: 0068
Process umount (pid: 3739, threadinfo=cebba000 task=c1595790)
Stack: de832a30 de83133b de831a21 000001a2 de831a06 dbb6fad4 dbb6fb38 de83b7c0
cebbaf0c c0183a0a 00052c00 ddf61900 00000000 cebbaf1c c015ad6f ddf61c80
00000000 dbb6fad4 dc38b9d8 08051bda cebbaf1c c018572a dbb6fad4 de83b980
Call Trace:
[<c0106e6a>] show_stack+0x7a/0x90
[<c0106fe9>] show_registers+0x149/0x1c0
[<c0107236>] die+0x126/0x2a0
[<c0107717>] do_invalid_op+0xd7/0x100
[<c0106931>] error_code+0x2d/0x38
[<c0183a0a>] generic_shutdown_super+0x1ba/0x260
[<c018572a>] kill_block_super+0x1a/0x40
[<c0183503>] deactivate_super+0x93/0xf0
[<c01a35c8>] __mntput+0x28/0x40
[<c018dd48>] path_release_on_umount+0x28/0x30
[<c01a4187>] sys_umount+0x37/0x80
[<c01a41e7>] sys_oldumount+0x17/0x20
[<c0106735>] sysenter_past_esp+0x52/0x71
Code: 83 de b8 06 1a 83 de 89 44 24 10 b8 a2 01 00 00 89 44 24 0c b8 21 1a 83 de
89 44 24 08 b8 3b 13 83 de 89 44 24 04 e8 3a c1 8f e1 <0f> 0b a2 01 21 1a 83 de
e9 31 ff ff ff 89 f8 89 f2 e8 e4 fd ff
Entering kdb (current=0xc1595790, pid 3739) Oops: invalid operand
due to oops @ 0xde82a786
eax = 0x0000005e ebx = 0xddf74b68 ecx = 0xc03d78c0 edx = 0x00000000
esi = 0xddf74ab4 edi = 0xdbb6fad4 esp = 0xcebbaebc eip = 0xde82a786
ebp = 0xcebbaedc xss = 0x00000068 xcs = 0x00000060 eflags = 0x00010292
xds = 0x0000007b xes = 0x0000007b origeax = 0xffffffff &regs = 0xcebbae88
kdb>
Then I entered 'bt' to show the call chain, but only the last call
was displayed and the arguments were not correct.
kdb> bt
Stack traceback for pid 3739
0xc1595790 3739 3337 1 0 R 0xc1595a10 *umount
EBP EIP Function (args)
0xcebbaedc 0xde82a786 [ext3]ext3_put_super+0x166 (0x5b, 0x4943, 0x499e, 0x499e)
0xcebbae18 0xc01265bb __call_console_drivers+0x4b (0x499e, 0x499e)
0xcebbae28 0xc0126659 _call_console_drivers+0x79 (0x30000002, 0x499e, 0x4940,
0x286, 0xcebbae74)
0xcebbae40 0xc01266bf call_console_drivers+0x5f (0xcebbae88, 0x0, 0xddf74b68,
0xc03d78c0, 0x0)
0xc0106931 error_code+0x2d
Interrupt registers:
eax = 0xdbb6fad4 ebx = 0xcebbae88 ecx = 0x00000000 edx = 0xddf74b68
esi = 0xc03d78c0 edi = 0x00000000 esp = 0x00000060 eip = 0x0000007b
ebp = 0xddf74ab4 xss = 0x00010292 xcs = 0xffffffff eflags = 0xde82a786
xds = 0xcebbaedc xes = 0x0000005e origeax = 0x0000007b &regs = 0xcebbae80
Interrupt from user space, end of kernel trace
kdb>
The above is just one of many cases where kdb cannot display the correct
call chain and arguments, either in a panic situation or when I hit
<ESC>KDB on my tty console port to force the kernel into kdb.
What is the problem?
2. I enabled spinlock debugging under kernel hacking, so the config has
CONFIG_DEBUG_SPINLOCK=y
When there is a problem with spin_lock or spin_unlock, the
kernel only displays the following messages and does not go into BUG().
lib/dec_and_lock.c:32: spin_lock(fs/dcache.c:c03dc280) already locked by lib/dec_and_lock.c/32
fs/dcache.c:101: spin_unlock(fs/dcache.c:c03dc280) not locked
lib/dec_and_lock.c:32: spin_lock(fs/dcache.c:c03dc280) already locked by lib/dec_and_lock.c/32
fs/dcache.c:101: spin_unlock(fs/dcache.c:c03dc280) not locked
lib/dec_and_lock.c:32: spin_lock(fs/dcache.c:c03dc280) already locked by lib/dec_and_lock.c/32
fs/dcache.c:101: spin_unlock(fs/dcache.c:c03dc280) not locked
Checking the source, I found two spinlock.h include files that
handle spinlock problems differently.
include/linux/spinlock.h only displays the above warning messages.
include/asm-i386/spinlock.h seems to go to BUG() when there
is a problem with spin_lock or spin_unlock.
See the _raw_spin_lock function in both include files.
What is the trick to force the kernel to go to BUG() when there is
a spin_lock/spin_unlock problem?
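To make the difference I am asking about concrete, here is a small
user-space sketch of the two behaviours as I understand them. It is not
the kernel code: every name in it (dbg_lock, dbg_policy_mode,
dbg_lock_acquire, and so on) is made up for illustration, and abort()
only stands in for BUG().

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
enum dbg_policy { DBG_WARN, DBG_BUG };

struct dbg_lock {
    int locked;          /* 1 while the lock is held */
    const char *owner;   /* "file:line" of the current holder, or NULL */
};

/* DBG_WARN models "print a message and continue";
 * DBG_BUG models "stop immediately", with abort() standing in for BUG(). */
static enum dbg_policy dbg_policy_mode = DBG_WARN;
static int dbg_violations;   /* number of misuses detected so far */

static void dbg_report(const char *msg, const char *where,
                       const struct dbg_lock *l)
{
    dbg_violations++;
    fprintf(stderr, "%s: %s (held by %s)\n",
            where, msg, l->owner ? l->owner : "nobody");
    if (dbg_policy_mode == DBG_BUG)
        abort();
}

static void dbg_lock_acquire(struct dbg_lock *l, const char *where)
{
    if (l->locked)
        dbg_report("spin_lock already locked", where, l);
    l->locked = 1;
    l->owner = where;
}

static void dbg_lock_release(struct dbg_lock *l, const char *where)
{
    if (!l->locked)
        dbg_report("spin_unlock not locked", where, l);
    l->locked = 0;
    l->owner = NULL;
}
```

With dbg_policy_mode left at DBG_WARN, a double acquire or a double
release only prints a message, which is what I see now. Setting it to
DBG_BUG makes the first misuse fatal, which is the behaviour I would
like so that kdb is entered at the point of the error.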
3. I used to be able to get the kernel to drop into kdb with a single
keystroke, <Control>A (Control and A are next to each other on my
keyboard), very quickly and easily. But now I have to hit <ESC>KDB
(4 keystrokes) to get into kdb. Is there an alternative?
I'll appreciate any comments and suggestions. Thanks,
John W.