Hello all, 

I have tried the second patch (tlb2) and got the same result: a page fault ("Unable to 
handle kernel NULL pointer dereference at ...") occurs after rtl_init() ("RTL started") is 
called in start_kernel() and the kernel init thread is created. A few minutes later, 
"stuck on TLB IPI wait (CPU#0)" messages appear. 
I am getting a similar result with rtai-1.2a/linux-2.2.14. The page fault occurs whenever 
rt_mount_rtai() is called in init_module(). An interesting detail: a few minutes later the 
internal speaker turns ON and stays ON until a hard reboot. The system hangs when INIT is 
sending processes the TERM signal (halt, reboot). 

(See attached file with page fault details for rtlinux and rtai).

So the problem is not really in the TLB IPI, but rather in the page fault that occurs 
(why???).
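
A side note for anyone reading the attached dumps: on i386, the number in the "Oops: NNNN" line is the page-fault error code, and its low three bits say whether the page was present, whether the access was a write, and whether the CPU was in user mode. Here is a minimal sketch of that decoding. The helper name decode_oops_code is hypothetical (it is not a kernel function), and only the three low bits are interpreted.

```python
# Decode the low three bits of an i386 page-fault error code, as printed
# in the "Oops: NNNN" line of a kernel oops.
# decode_oops_code is a hypothetical helper name, not a kernel function.

def decode_oops_code(code):
    """Return a dict describing the three low page-fault error code bits."""
    return {
        # bit 0: 0 = page not present, 1 = protection violation
        "cause": "protection violation" if code & 0x1 else "page not present",
        # bit 1: 0 = read access, 1 = write access
        "access": "write" if code & 0x2 else "read",
        # bit 2: 0 = kernel mode, 1 = user mode
        "mode": "user" if code & 0x4 else "kernel",
    }


if __name__ == "__main__":
    # "Oops: 0002" appears in both the rtlinux and the rtai dump.
    print(decode_oops_code(int("0002", 16)))
```

For 0002 this gives a kernel-mode write to a page that is not present, which is consistent with the fault address 00000000 reported in both dumps.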
 
I think we have some profound problem with the hardware. My system has an 82440FX 
(Natoma) chipset, a PIIX3 ISA bridge and IDE controller, and an Intel 82093AA IOAPIC. The 
configuration is quite different from the default configuration in the Intel MP V1.4 spec:
APIC base at 0xFEC08000 (default 0xFEE00000)
IOAPIC base at 0xFEC00000 (same as default)
24 registers in IOAPIC (default is 16)
Local APIC ID: 0 (CPU #0)
Local APIC ID: 4 (CPU #4, default 1 ???) 
IOAPIC ID: 13
Local APIC and IOAPIC version 17
Virtual Wire compatibility mode is implemented

Linux gets all the SMP information from the BIOS MP table OK and maps the local APICs and 
the IOAPIC all right.

I have been playing with various configuration adjustments (streamlined Linux kernel, 
fault resilient boot disabled, PCI interrupt redirection to the IOAPIC enabled/disabled, 
MTRR patch, fix for the PCI passive release problem in the 440FX, double-checking the 
installation, etc.); they all seem to be irrelevant.

Rtai-1.2 UP works fine on my system.

I will keep digging; the good thing is that this problem got me interested in learning 
more about SMP. 
 

Sergey.


----- Original Message ----- 
From: <[EMAIL PROTECTED]>
To: Surya <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Tuesday, April 11, 2000 4:54 PM
Subject: Re: [rtl] SMP error


> 
> Please try the patch 
> 
> pub/rtlinux/v2/test_tlb_fix_patch.2.1.14.B
> 
> And tell me how it works.
> 
> 

Apr 10 09:29:18 dualpro kernel: RTL started
Apr 10 09:29:18 dualpro kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000000
Apr 10 09:29:18 dualpro kernel: current->tss.cr3 = 00101000, %cr3 = 00101000
Apr 10 09:29:18 dualpro kernel: *pde = 00000000
Apr 10 09:29:18 dualpro kernel: Oops: 0002
Apr 10 09:29:18 dualpro kernel: CPU:    4
Apr 10 09:29:18 dualpro kernel: EIP:    0010:[<00000000>]
Apr 10 09:29:18 dualpro kernel: EFLAGS: 00010087
Apr 10 09:29:18 dualpro kernel: eax: 00000000   ebx: 00000004   ecx: c7dfa000   edx: 00000018
Apr 10 09:29:18 dualpro kernel: esi: 00000041   edi: c7dfbf7c   ebp: c7dfbf74   esp: c7dfbf5c
Apr 10 09:29:18 dualpro kernel: ds: 0018   es: 0018   ss: 0018
Apr 10 09:29:18 dualpro kernel: Process swapper (pid: 0, process nr: 1, stackpage=c7dfb000)
Apr 10 09:29:18 dualpro kernel: Stack: c7dfa000 c7dfa000 c01fe540 00000041 c7dfbf90 00000000 00000000 c010acfc
Apr 10 09:29:18 dualpro kernel:        c7dfa000 c7dfa000 00000002 c7dfa000 c01fe540 00000000 00000200 00000018
Apr 10 09:29:18 dualpro kernel:        00000018 00000041 c01079a5 00000010 00000246 00000000 00000000 00000000
Apr 10 09:29:18 dualpro kernel: Call Trace: [common_smp_interrupt+24/48] [cpu_idle+61/80] [do_IRQ+69/72] [rtl_intercept+116/424] [common_interrupt+24/48]
Apr 10 09:29:18 dualpro kernel: Code: <1>Unable to handle kernel NULL pointer dereference at virtual address 00000000
Apr 10 09:29:18 dualpro kernel: current->tss.cr3 = 00101000, %cr3 = 00101000
Apr 10 09:29:18 dualpro kernel: *pde = 00000000
Apr 10 09:29:18 dualpro kernel: PCI: PCI BIOS revision 2.10 entry at 0xfd8b1

==============================================================================
***** RTAI NEWLY MOUNTED (MOUNT COUNT 1) ******

Unable to handle kernel NULL pointer dereference at virtual address 00000000
current->tss.cr3 = 05ab7000, %cr3 = 05ab7000
*pde = 00000000
Oops: 0002
CPU:    0
EIP:    0010:[<c80166e2>]
EFLAGS: 00010046
eax: 00000000   ebx: c79a4000   ecx: 00000000   edx: 00000000
esi: 00000000   edi: 00000000   ebp: c79a5f50   esp: c79a5eec
ds: 0018   es: 0018   ss: 0018
Process klogd (pid: 337, process nr: 12, stackpage=c79a5000)
Stack: c79a4000 c79a5f50 00000000 c8015090 00000000 c01e2000 c79a5f50 00000000
       c01e0018 c01e0018 00000000 c01a5407 00000010 00000202 c79a4000 c79a4000
       c79a5f84 0e32102c c7a1e000 c7c37360 00000037 c79a4000 00000000 00000000
Call Trace: [<c8015090>] [<c01a5407>] [<c0116045>] [<c01492a5>] [<c0128882>] [<c0109208>]
Code: f0 0f ab 02 19 db 85 db 75 f6 8b a9 20 ac 01 c8 85 ed 74 3c

