Corey,

I applied the 2.6.23 IPMI UART system interface patches to my 2.6.23 8641
kernel (serial_core.c & 8250.c).

Now when I do a 'reboot' from the command line I get the following kernel
oops:

Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=2 kxx8641
Modules linked in:
NIP: c01580a4 LR: c016c77c CTR: c016c764
REGS: c1a23a50 TRAP: 0300   Not tainted  (2.6.23-ksi8641-rel_1_0-rc2)
MSR: 00009032 <EE,ME,IR,DR>  CR: 42000242  XER: 20000000
DAR: 000000b8, DSISR: 40000000
TASK = effdac70[281] 'init' THREAD: c1a22000 CPU: 0
GPR00: c016c77c c1a23b00 effdac70 00000000 00009032 fffba466 00000100 c04a0000
GPR08: c1a23b18 c04ca000 0005b000 c016c764 22000244 1008c5ac c04b0000 00000001
GPR16: ffffffff 00000000 c1a23c50 00000000 393d66b8 c04a0000 00000000 c04bd960
GPR24: c04bd960 c04b33a0 c048710c c1a22000 00000000 00000002 c04ca018 00000000
NIP [c01580a4] tty_wakeup+0x14/0x9c
LR [c016c77c] uart_tasklet_action+0x18/0x28
Call Trace:
[c1a23b00] [c016861c] ipmi_serial_timeout+0x0/0x78 (unreliable)
[c1a23b10] [c016c77c] uart_tasklet_action+0x18/0x28
[c1a23b20] [c0029508] tasklet_action+0xbc/0x194
[c1a23b50] [c0029814] __do_softirq+0xa0/0x13c
[c1a23b90] [c00065b4] do_softirq+0x64/0x68
[c1a23ba0] [c0029220] irq_exit+0x54/0x64
[c1a23bb0] [c000e260] timer_interrupt+0x2e4/0x6cc
[c1a23c40] [c0011374] ret_from_except+0x0/0x14
--- Exception: 901 at vprintk+0x25c/0x434
    LR = vprintk+0x2d4/0x434
[c1a23d90] [c0023f7c] printk+0x50/0x60
[c1a23e10] [c0032e14] kernel_restart+0x7c/0x98
[c1a23e20] [c00354ac] sys_reboot+0x170/0x200
[c1a23f40] [c0010cc8] ret_from_syscall+0x0/0x38
--- Exception: c01 at 0xff252fc
    LR = 0x1001c938
Instruction dump:
4becbeb9 80010034 387f000c bbc10028 38210030 7c0803a6 4e800020 7c0802a6
9421fff0 bfc10008 7c7f1b78 90010014 <800300b8> 70090020 4082002c 387f0128
Kernel panic - not syncing: Fatal exception in interrupt
Rebooting in 180 seconds..

The problem is that state->info->tty is NULL, so uart_tasklet_action()
passes NULL to tty_wakeup(), which then dereferences it.  This matches the
trace: the faulting address in DAR is 000000b8, a small offset from NULL.

I can make the problem go away in three ways:

a) Remove the printk's in sys.c:kernel_restart()
b) Add an if (state->info->tty) check above the call to tty_wakeup() in
   serial_core.c:uart_tasklet_action()
c) Remove the state->info->tty = NULL in serial_core.c:uart_close()

Option (c) stops the oops, but something is still wrong: the board locks up
instead of oopsing.
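
For reference, workaround (b) can be sketched in user space as below.  This
is only a minimal model, not the actual serial_core.c code: the structures
are cut-down stand-ins for the kernel ones, and the wakeups counter and
mock_uart_close() are hypothetical names added for illustration.  In the
real kernel a bare NULL check may only narrow the window rather than close
it, since nothing here orders the tasklet against uart_close().

```c
#include <stddef.h>

/* Cut-down stand-ins for the kernel structures; only the fields used here. */
struct tty_struct {
	unsigned long flags;
	int wakeups;		/* hypothetical counter, for illustration only */
};

struct uart_info {
	struct tty_struct *tty;
};

struct uart_state {
	struct uart_info *info;
};

/* Stand-in for tty_wakeup(): like the real one, it dereferences tty. */
static void tty_wakeup(struct tty_struct *tty)
{
	tty->wakeups++;
}

/* Workaround (b): skip the wakeup if the tty has already been detached. */
static void uart_tasklet_action(unsigned long data)
{
	struct uart_state *state = (struct uart_state *)data;
	struct tty_struct *tty = state->info->tty;	/* read the pointer once */

	if (tty)
		tty_wakeup(tty);
}

/* Models the single assignment in uart_close() that matters here. */
static void mock_uart_close(struct uart_state *state)
{
	state->info->tty = NULL;
}
```

Calling uart_tasklet_action() after mock_uart_close() then returns without
touching the tty, which corresponds to the tasklet firing late during
shutdown.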

There appears to be a timing issue between the tty_wakeup() call (I think
triggered by the printk's in kernel_restart()) and the setting of
state->info->tty = NULL as part of the kernel shutdown.  However, I wasn't
getting very far tracking this down any further, so I thought I would let
you know about it.

David Jenkins
[EMAIL PROTECTED]
_______________________________________________
Openipmi-developer mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/openipmi-developer
