Re: New interrupt stuff breaks ASUS 2 CPU system

2003-11-11 Thread John Baldwin

On 11-Nov-2003 John Hay wrote:
>> >> >> With the new interrupt code I get:
>> >> >> <...>
>> >> >> OK boot
>> >> >> cpuid = 0; apic id = 00
>> >> >> instruction pointer = 0x0:0xa00
>> >> >> stack pointer   = 0x0:0xffe
>> >> >> frame pointer   = 0x0:0x0
>> >> >> code segment= base 0x0, limit 0x0, type 0x0
>> >> >> = DPL 0, pres 0, def32 0, gran 0
>> >> >> processor eflags= interrupt enabled, vm86, IOPL = 0
>> >> >> current process = 0 ()
>> >> >> kernel: type 30 trap, code=0
>> >> >> Stopped at  0xa00:  cli
>> >> >> db> tr
>> >> >> (null)(0,0,0,0,0) at 0xa00
>> >> >> <...>
>> >> >> 
>> >> >> However, if I enter 'continue' at the DDB prompt it continues to boot
>> >> >> and the system seems to runs fine:
>> >> >> 
>> >> >> <...>
>> >> >> db> continue
>> >> > ...
>> >> >> Copyright (c) 1992-2003 The FreeBSD Project.
>> >> >> <...>
>> >> >> 
>> >> > 
>> >> > Now why didn't I think of trying 'continue'? Hey there my old dual
>> >> > Pentium I diskless machine is running in SMP mode.
>> >> 
>> >> Can you try this patch:
>> >> 
>> >> http://www.FreeBSD.org/~jhb/patches/atpic.patch
>> > 
>> > Ah, great, continue is not needed anymore. Now to see if someone can
>> > figure out why my dual PII get a "panic: probing for non-PCI bus" when
>> > booting. :-)
>> 
>> Actually, can you try spurious.patch (same URL directory) instead and
>> see if that is sufficient to fix it?
> 
> Nope, this behaves the same as without the patches, ie. I have to type
> continue.

Grrr, that's really bogus then. :(

-- 

John Baldwin <[EMAIL PROTECTED]>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: New interrupt stuff breaks ASUS 2 CPU system

2003-11-10 Thread John Hay
> >> >> With the new interrupt code I get:
> >> >> <...>
> >> >> OK boot
> >> >> cpuid = 0; apic id = 00
> >> >> instruction pointer = 0x0:0xa00
> >> >> stack pointer   = 0x0:0xffe
> >> >> frame pointer   = 0x0:0x0
> >> >> code segment= base 0x0, limit 0x0, type 0x0
> >> >> = DPL 0, pres 0, def32 0, gran 0
> >> >> processor eflags= interrupt enabled, vm86, IOPL = 0
> >> >> current process = 0 ()
> >> >> kernel: type 30 trap, code=0
> >> >> Stopped at  0xa00:  cli
> >> >> db> tr
> >> >> (null)(0,0,0,0,0) at 0xa00
> >> >> <...>
> >> >> 
> >> >> However, if I enter 'continue' at the DDB prompt it continues to boot
> >> >> and the system seems to runs fine:
> >> >> 
> >> >> <...>
> >> >> db> continue
> >> > ...
> >> >> Copyright (c) 1992-2003 The FreeBSD Project.
> >> >> <...>
> >> >> 
> >> > 
> >> > Now why didn't I think of trying 'continue'? Hey there my old dual
> >> > Pentium I diskless machine is running in SMP mode.
> >> 
> >> Can you try this patch:
> >> 
> >> http://www.FreeBSD.org/~jhb/patches/atpic.patch
> > 
> > Ah, great, continue is not needed anymore. Now to see if someone can
> > figure out why my dual PII get a "panic: probing for non-PCI bus" when
> > booting. :-)
> 
> Actually, can you try spurious.patch (same URL directory) instead and
> see if that is sufficient to fix it?

Nope, this behaves the same as without the patches, i.e. I have to type
'continue'.

John
-- 
John Hay -- [EMAIL PROTECTED] / [EMAIL PROTECTED]


Re: New interrupt stuff breaks ASUS 2 CPU system

2003-11-10 Thread John Baldwin

On 10-Nov-2003 John Hay wrote:
> On Mon, Nov 10, 2003 at 02:12:56PM -0500, John Baldwin wrote:
>> 
>> On 10-Nov-2003 John Hay wrote:
>> >> 
>> >> With the new interrupt code I get:
>> >> <...>
>> >> OK boot
>> >> cpuid = 0; apic id = 00
>> >> instruction pointer = 0x0:0xa00
>> >> stack pointer   = 0x0:0xffe
>> >> frame pointer   = 0x0:0x0
>> >> code segment= base 0x0, limit 0x0, type 0x0
>> >> = DPL 0, pres 0, def32 0, gran 0
>> >> processor eflags= interrupt enabled, vm86, IOPL = 0
>> >> current process = 0 ()
>> >> kernel: type 30 trap, code=0
>> >> Stopped at  0xa00:  cli
>> >> db> tr
>> >> (null)(0,0,0,0,0) at 0xa00
>> >> <...>
>> >> 
>> >> However, if I enter 'continue' at the DDB prompt it continues to boot
>> >> and the system seems to runs fine:
>> >> 
>> >> <...>
>> >> db> continue
>> > ...
>> >> Copyright (c) 1992-2003 The FreeBSD Project.
>> >> <...>
>> >> 
>> > 
>> > Now why didn't I think of trying 'continue'? Hey there my old dual
>> > Pentium I diskless machine is running in SMP mode.
>> 
>> Can you try this patch:
>> 
>> http://www.FreeBSD.org/~jhb/patches/atpic.patch
> 
> Ah, great, continue is not needed anymore. Now to see if someone can
> figure out why my dual PII get a "panic: probing for non-PCI bus" when
> booting. :-)

Actually, can you try spurious.patch (same URL directory) instead and
see if that is sufficient to fix it?

-- 

John Baldwin <[EMAIL PROTECTED]>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/


Re: New interrupt stuff breaks ASUS 2 CPU system

2003-11-10 Thread John Hay
On Mon, Nov 10, 2003 at 02:12:56PM -0500, John Baldwin wrote:
> 
> On 10-Nov-2003 John Hay wrote:
> >> 
> >> With the new interrupt code I get:
> >> <...>
> >> OK boot
> >> cpuid = 0; apic id = 00
> >> instruction pointer = 0x0:0xa00
> >> stack pointer   = 0x0:0xffe
> >> frame pointer   = 0x0:0x0
> >> code segment= base 0x0, limit 0x0, type 0x0
> >> = DPL 0, pres 0, def32 0, gran 0
> >> processor eflags= interrupt enabled, vm86, IOPL = 0
> >> current process = 0 ()
> >> kernel: type 30 trap, code=0
> >> Stopped at  0xa00:  cli
> >> db> tr
> >> (null)(0,0,0,0,0) at 0xa00
> >> <...>
> >> 
> >> However, if I enter 'continue' at the DDB prompt it continues to boot
> >> and the system seems to runs fine:
> >> 
> >> <...>
> >> db> continue
> > ...
> >> Copyright (c) 1992-2003 The FreeBSD Project.
> >> <...>
> >> 
> > 
> > Now why didn't I think of trying 'continue'? Hey there my old dual
> > Pentium I diskless machine is running in SMP mode.
> 
> Can you try this patch:
> 
> http://www.FreeBSD.org/~jhb/patches/atpic.patch

Ah, great, continue is not needed anymore. Now to see if someone can
figure out why my dual PII gets a "panic: probing for non-PCI bus" when
booting. :-)

John
-- 
John Hay -- [EMAIL PROTECTED] / [EMAIL PROTECTED]


Re: New interrupt stuff breaks ASUS 2 CPU system

2003-11-10 Thread Marius Strobl
On Mon, Nov 10, 2003 at 02:12:56PM -0500, John Baldwin wrote:
> 
> On 10-Nov-2003 John Hay wrote:
> >> 
> >> With the new interrupt code I get:
> >> <...>
> >> OK boot
> >> cpuid = 0; apic id = 00
> >> instruction pointer = 0x0:0xa00
> >> stack pointer   = 0x0:0xffe
> >> frame pointer   = 0x0:0x0
> >> code segment= base 0x0, limit 0x0, type 0x0
> >> = DPL 0, pres 0, def32 0, gran 0
> >> processor eflags= interrupt enabled, vm86, IOPL = 0
> >> current process = 0 ()
> >> kernel: type 30 trap, code=0
> >> Stopped at  0xa00:  cli
> >> db> tr
> >> (null)(0,0,0,0,0) at 0xa00
> >> <...>
> >> 
> >> However, if I enter 'continue' at the DDB prompt it continues to boot
> >> and the system seems to runs fine:
> >> 
> >> <...>
> >> db> continue
> > ...
> >> Copyright (c) 1992-2003 The FreeBSD Project.
> >> <...>
> >> 
> > 
> > Now why didn't I think of trying 'continue'? Hey there my old dual
> > Pentium I diskless machine is running in SMP mode.
> 
> Can you try this patch:
> 
> http://www.FreeBSD.org/~jhb/patches/atpic.patch
> 

Works here, thanks!
Btw., I also get such a stray interrupt on my Sun U60, IIRC also from the
printer port :)



Re: New interrupt stuff breaks ASUS 2 CPU system

2003-11-10 Thread John Baldwin

On 10-Nov-2003 John Hay wrote:
>> 
>> With the new interrupt code I get:
>> <...>
>> OK boot
>> cpuid = 0; apic id = 00
>> instruction pointer = 0x0:0xa00
>> stack pointer   = 0x0:0xffe
>> frame pointer   = 0x0:0x0
>> code segment= base 0x0, limit 0x0, type 0x0
>> = DPL 0, pres 0, def32 0, gran 0
>> processor eflags= interrupt enabled, vm86, IOPL = 0
>> current process = 0 ()
>> kernel: type 30 trap, code=0
>> Stopped at  0xa00:  cli
>> db> tr
>> (null)(0,0,0,0,0) at 0xa00
>> <...>
>> 
>> However, if I enter 'continue' at the DDB prompt it continues to boot
>> and the system seems to runs fine:
>> 
>> <...>
>> db> continue
> ...
>> Copyright (c) 1992-2003 The FreeBSD Project.
>> <...>
>> 
> 
> Now why didn't I think of trying 'continue'? Hey there my old dual
> Pentium I diskless machine is running in SMP mode.

Can you try this patch:

http://www.FreeBSD.org/~jhb/patches/atpic.patch

-- 

John Baldwin <[EMAIL PROTECTED]>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/


Re: New interrupt stuff breaks ASUS 2 CPU system

2003-11-10 Thread John Baldwin

On 10-Nov-2003 Marius Strobl wrote:
> On Thu, Nov 06, 2003 at 12:22:45PM -0500, John Baldwin wrote:
>> 
>> On 06-Nov-2003 Harti Brandt wrote:
>> > JB>I figured out what is happenning I think.  You are getting a spurious
>> > JB>interrupt from the 8259A PIC (which comes in on IRQ 7).  The IRR register
>> > JB>lists pending interrupts still waiting to be serviced.  Try using
>> > JB>'options NO_MIXED_MODE' to stop using the 8259A's for the clock and see if
>> > JB>the spurious IRQ 7 interrupts go away.
>> > 
>> > Ok, that seems to help. Interesting although why do these interrupts
>> > happen only with a larger HZ and when the kernel is doing printfs (this
>> > machine has a serial console). I have also not tried to disable SIO2 and
>> > the parallel port.
>> 
>> Can you also try turning mixed mode back on and using
>> http://www.FreeBSD.org/~jhb/patches/spurious.patch
>> 
>> You should get some stray IRQ 7's in the vmstat -i output as well as a few
>> printf's to the kernel console.
>> 
> 
> I think I'm seeing something related here, with the old interrupt code I
> got:
> <...>
> Hit [Enter] to boot immediately, or any other key for command prompt.
> Booting [/boot/kernel/kernel]...   
> ACPI autoload failed - no such file or directory
> stray irq 7
> ^^^
> Copyright (c) 1992-2003 The FreeBSD Project.

Peter has seen this on an amd64 machine.  It seems we can get an interrupt
from the AT PIC before we get a chance to program the PICs to mask all their
inputs.
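
For reference, "program the PICs to mask all their inputs" can be sketched
in C as below. This is only an illustration and not the contents of
atpic.patch: the helper name pic_mask_all and the outb callback are
hypothetical, and a recording fake stands in for real port I/O so the
sketch can be compiled outside the kernel. The port numbers are the
conventional PC ones for the two 8259A mask registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PIC1_DATA    0x21   /* master 8259A data port: OCW1 = interrupt mask */
#define PIC2_DATA    0xa1   /* slave 8259A data port:  OCW1 = interrupt mask */
#define PIC_MASK_ALL 0xff   /* one bit per input; a set bit masks that input */

/* Hypothetical port-write callback so the sketch runs without real I/O. */
typedef void (*outb_fn)(uint16_t port, uint8_t value);

/* Mask every input on both PICs. */
static void pic_mask_all(outb_fn outb)
{
    assert(outb != NULL);
    outb(PIC1_DATA, PIC_MASK_ALL);
    outb(PIC2_DATA, PIC_MASK_ALL);
}

/* Recording fake used in place of a real outb instruction. */
static uint8_t fake_imr_master, fake_imr_slave;

static void fake_mask_outb(uint16_t port, uint8_t value)
{
    if (port == PIC1_DATA)
        fake_imr_master = value;
    else if (port == PIC2_DATA)
        fake_imr_slave = value;
}
```

Calling pic_mask_all(fake_mask_outb) leaves both fake mask registers at
0xff. The window described above is the time between the loader handing
over control and the kernel performing writes like these.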

> <...>
> 
> With the new interrupt code I get:
> <...>
> OK boot
> cpuid = 0; apic id = 00
> instruction pointer = 0x0:0xa00
> stack pointer   = 0x0:0xffe
> frame pointer   = 0x0:0x0
> code segment= base 0x0, limit 0x0, type 0x0
> = DPL 0, pres 0, def32 0, gran 0
> processor eflags= interrupt enabled, vm86, IOPL = 0
> current process = 0 ()
> kernel: type 30 trap, code=0
> Stopped at  0xa00:  cli
> db> tr
> (null)(0,0,0,0,0) at 0xa00
> <...>
> 
> However, if I enter 'continue' at the DDB prompt it continues to boot
> and the system seems to runs fine:
> 
> <...>
> db> continue
> SMAP type=01 base= len=0009f400
> SMAP type=02 base=0009f400 len=0c00
> SMAP type=02 base=000d len=0003
> SMAP type=01 base=0010 len=1fdf
> SMAP type=03 base=1fef len=f000
> SMAP type=04 base=1feff000 len=1000
> SMAP type=01 base=1ff0 len=0008
> SMAP type=02 base=1ff8 len=0008
> SMAP type=02 base=fec0 len=4000
> SMAP type=02 base=fee0 len=1000
> SMAP type=02 base=fff8 len=0008
> Copyright (c) 1992-2003 The FreeBSD Project.
> <...>
> 
> Neiter the spurious interrupt patch nor setting 'options NO_MIXED_MODE'
> makes a difference. This is on a Tyan Tiger MPX S2466N-4M board, a full
> verbose boot log is at: http://quad.zeist.de/newintr.log
> 

-- 

John Baldwin <[EMAIL PROTECTED]>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/


Re: New interrupt stuff breaks ASUS 2 CPU system

2003-11-10 Thread John Hay
> 
> With the new interrupt code I get:
> <...>
> OK boot
> cpuid = 0; apic id = 00
> instruction pointer = 0x0:0xa00
> stack pointer   = 0x0:0xffe
> frame pointer   = 0x0:0x0
> code segment= base 0x0, limit 0x0, type 0x0
> = DPL 0, pres 0, def32 0, gran 0
> processor eflags= interrupt enabled, vm86, IOPL = 0
> current process = 0 ()
> kernel: type 30 trap, code=0
> Stopped at  0xa00:  cli
> db> tr
> (null)(0,0,0,0,0) at 0xa00
> <...>
> 
> However, if I enter 'continue' at the DDB prompt it continues to boot
> and the system seems to runs fine:
> 
> <...>
> db> continue
...
> Copyright (c) 1992-2003 The FreeBSD Project.
> <...>
> 

Now why didn't I think of trying 'continue'? Hey there my old dual
Pentium I diskless machine is running in SMP mode.

John
-- 
John Hay -- [EMAIL PROTECTED] / [EMAIL PROTECTED]


Re: New interrupt stuff breaks ASUS 2 CPU system

2003-11-10 Thread Marius Strobl
On Thu, Nov 06, 2003 at 12:22:45PM -0500, John Baldwin wrote:
> 
> On 06-Nov-2003 Harti Brandt wrote:
> > JB>I figured out what is happenning I think.  You are getting a spurious
> > JB>interrupt from the 8259A PIC (which comes in on IRQ 7).  The IRR register
> > JB>lists pending interrupts still waiting to be serviced.  Try using
> > JB>'options NO_MIXED_MODE' to stop using the 8259A's for the clock and see if
> > JB>the spurious IRQ 7 interrupts go away.
> > 
> > Ok, that seems to help. Interesting although why do these interrupts
> > happen only with a larger HZ and when the kernel is doing printfs (this
> > machine has a serial console). I have also not tried to disable SIO2 and
> > the parallel port.
> 
> Can you also try turning mixed mode back on and using
> http://www.FreeBSD.org/~jhb/patches/spurious.patch
> 
> You should get some stray IRQ 7's in the vmstat -i output as well as a few
> printf's to the kernel console.
> 

I think I'm seeing something related here. With the old interrupt code I
got:
<...>
Hit [Enter] to boot immediately, or any other key for command prompt.
Booting [/boot/kernel/kernel]...   
ACPI autoload failed - no such file or directory
stray irq 7
^^^
Copyright (c) 1992-2003 The FreeBSD Project.
<...>

With the new interrupt code I get:
<...>
OK boot
cpuid = 0; apic id = 00
instruction pointer = 0x0:0xa00
stack pointer   = 0x0:0xffe
frame pointer   = 0x0:0x0
code segment= base 0x0, limit 0x0, type 0x0
= DPL 0, pres 0, def32 0, gran 0
processor eflags= interrupt enabled, vm86, IOPL = 0
current process = 0 ()
kernel: type 30 trap, code=0
Stopped at  0xa00:  cli
db> tr
(null)(0,0,0,0,0) at 0xa00
<...>

However, if I enter 'continue' at the DDB prompt it continues to boot
and the system seems to run fine:

<...>
db> continue
SMAP type=01 base= len=0009f400
SMAP type=02 base=0009f400 len=0c00
SMAP type=02 base=000d len=0003
SMAP type=01 base=0010 len=1fdf
SMAP type=03 base=1fef len=f000
SMAP type=04 base=1feff000 len=1000
SMAP type=01 base=1ff0 len=0008
SMAP type=02 base=1ff8 len=0008
SMAP type=02 base=fec0 len=4000
SMAP type=02 base=fee0 len=1000
SMAP type=02 base=fff8 len=0008
Copyright (c) 1992-2003 The FreeBSD Project.
<...>

Neither the spurious interrupt patch nor setting 'options NO_MIXED_MODE'
makes a difference. This is on a Tyan Tiger MPX S2466N-4M board; a full
verbose boot log is at: http://quad.zeist.de/newintr.log



Re: New interrupt stuff breaks ASUS 2 CPU system

2003-11-07 Thread Bruce Evans
On Fri, 7 Nov 2003, Stefan [iso-8859-1] Eßer wrote:

> On 2003-11-07 20:04 +1100, Bruce Evans <[EMAIL PROTECTED]> wrote:
> > However, using the apic almost doubles the overheads for the a45 cases.
> > This seems to be due to extra interrupts.  The UART and/or driver already
>
> Just another data point:
>
> Seems that the interrupt rate doubled for drm0 on my system
> (from 60 to 120 driving a LCD at 60Hz vertical refresh).
>
> I thought this might be a problem with shared interrupts (drm0
> and xl0 shared APIC IRQ 16), but removing the (actually unused)
> xl driver did not make a difference ...

Hmm.  My a45 UARTs are the only ones with a pci level triggered interrupt:

Nov  7 01:48:44 gamplex kernel: ioapic0: Routing IRQ 5 -> intpin 19
Nov  7 01:48:44 gamplex kernel: ioapic0: intpin 5 disabled
Nov  7 01:48:44 gamplex kernel: ioapic0: intpin 19 trigger: level
Nov  7 01:48:44 gamplex kernel: ioapic0: intpin 19 polarity: active-lo

There is only one other level triggered interrupt in the system that is
used:

Nov  7 01:48:44 gamplex kernel: ioapic0: Routing IRQ 11 -> intpin 18
Nov  7 01:48:44 gamplex kernel: ioapic0: intpin 11 disabled
Nov  7 01:48:44 gamplex kernel: ioapic0: intpin 18 trigger: level
Nov  7 01:48:44 gamplex kernel: ioapic0: intpin 18 polarity: active-lo

and I suspect it may be doing strange things too: I found that rev.1.23
of ata_lowlevel.c broke atapicam, but the new interrupt code magically
fixed it.  One of the atapicam devices is the only device on IRQ11.

Bruce


RE: New interrupt stuff breaks ASUS 2 CPU system

2003-11-07 Thread Harti Brandt
On Thu, 6 Nov 2003, John Baldwin wrote:

JB>
JB>On 06-Nov-2003 Harti Brandt wrote:
JB>> JB>I figured out what is happenning I think.  You are getting a spurious
JB>> JB>interrupt from the 8259A PIC (which comes in on IRQ 7).  The IRR register
JB>> JB>lists pending interrupts still waiting to be serviced.  Try using
JB>> JB>'options NO_MIXED_MODE' to stop using the 8259A's for the clock and see if
JB>> JB>the spurious IRQ 7 interrupts go away.
JB>>
JB>> Ok, that seems to help. Interesting although why do these interrupts
JB>> happen only with a larger HZ and when the kernel is doing printfs (this
JB>> machine has a serial console). I have also not tried to disable SIO2 and
JB>> the parallel port.
JB>
JB>Can you also try turning mixed mode back on and using
JB>http://www.FreeBSD.org/~jhb/patches/spurious.patch
JB>
JB>You should get some stray IRQ 7's in the vmstat -i output as well as a few
JB>printf's to the kernel console.

Now I'm getting the same 'Couldn't get vector from ISR!' as before on
Xapic_isr1. Again ISR1 is 0 and IRR1 is 0x100.

Here is some data:

db> trace
Debugger(c05ea5f4,0,c05fa63b,c0821b5c,100) at Debugger+0x55
panic(c05fa63b,c0821b6c,c062ab80,c0821bb4,c05ab57d) at panic+0x156
lapic_handle_intr() at lapic_handle_intr+0x1b
Xapic_isr1() at Xapic_isr1+0x3d
--- interrupt, eip = 0xc04bbbfd, esp = 0xc0821bb0, ebp = 0xc0821bb4 ---
critical_exit(c0821bf4,c059af49,c0638100,0,c05f7a08) at critical_exit+0x2d
_mtx_unlock_spin_flags(c0638100,0,c05f7a08,c88,c0821bec) at _mtx_unlock_spin_flags+0x23
siocnputc(c061e8e0,a,5,c0821d10,a) at siocnputc+0xe9
cnputc(a,2060d900,1,0,c05eec77) at cnputc+0x7a
putchar(a,c0821d10,1,0,0) at putchar+0x6c
kvprintf(c05eec76,c04d46b0,c0821d10,a,c0821d30) at kvprintf+0x8d
printf(c05eec76,0,,0,c05c6e20) at printf+0x57
tc_init(c0622c60,c0821d78,c05c7b8f,8,8) at tc_init+0xc4
init_TSC_tc(8,8,c05c6e20,0,a0) at init_TSC_tc+0x91
cpu_initclocks(c0821d98,c0490ac5,0,81e000,81ec00) at cpu_initclocks+0x11f
initclocks(0,81e000,81ec00,81e000,0) at initclocks+0x8
mi_startup() at mi_startup+0xb5
begin() at begin+0x2c
db> x *lapic+0x110
0xd78f8110: 0
db> x *lapic+0x210
0xd78f8210: 100

IRQ7 is the parallel port according to dmesg.
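
The two ddb reads above can be decoded with a small C sketch, assuming
the usual local APIC register layout from the Intel manuals: ISR words
start at offset 0x100, IRR words at 0x200, each word sits on a 16-byte
boundary and covers 32 vectors. The helper names are hypothetical and
not taken from the kernel source.

```c
#include <assert.h>
#include <stdint.h>

#define LAPIC_ISR0       0x100u  /* in-service register, vectors 0-31 */
#define LAPIC_IRR0       0x200u  /* interrupt request register, vectors 0-31 */
#define LAPIC_REG_STRIDE 0x10u   /* registers are spaced 16 bytes apart */

/* Offset of the ISR or IRR word (selected by base) that covers a vector. */
static uint32_t lapic_reg_offset(uint32_t base, unsigned vector)
{
    assert(vector < 256);
    return base + (vector / 32) * LAPIC_REG_STRIDE;
}

/* Lowest vector whose bit is set in word `index`, or -1 if the word is 0. */
static int lapic_lowest_vector(unsigned index, uint32_t value)
{
    assert(index < 8);
    for (unsigned bit = 0; bit < 32; bit++) {
        if (value & (UINT32_C(1) << bit))
            return (int)(index * 32 + bit);
    }
    return -1;
}
```

Under that layout, lapic+0x110 is ISR1 and lapic+0x210 is IRR1, both
covering vectors 32-63. An IRR1 value of 0x100 is bit 8, i.e. vector 40
pending, while ISR1 == 0 means no vector in that range is marked in
service. Which IRQ a given vector belongs to depends on the kernel's
vector assignment, which is not spelled out in this thread.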

harti
-- 
harti brandt,
http://www.fokus.fraunhofer.de/research/cc/cats/employees/hartmut.brandt/private
[EMAIL PROTECTED], [EMAIL PROTECTED]


RE: New interrupt stuff breaks ASUS 2 CPU system

2003-11-07 Thread Bruce Evans
On Thu, 6 Nov 2003, John Baldwin wrote:

> On 06-Nov-2003 Harti Brandt wrote:
> > JB>I figured out what is happenning I think.  You are getting a spurious
> > JB>interrupt from the 8259A PIC (which comes in on IRQ 7).  The IRR register
> > JB>lists pending interrupts still waiting to be serviced.  Try using
> > JB>'options NO_MIXED_MODE' to stop using the 8259A's for the clock and see if
> > JB>the spurious IRQ 7 interrupts go away.
> >
> > Ok, that seems to help. Interesting although why do these interrupts
> > happen only with a larger HZ and when the kernel is doing printfs (this
> > machine has a serial console). I have also not tried to disable SIO2 and
> > the parallel port.
>
> Can you also try turning mixed mode back on and using
> http://www.FreeBSD.org/~jhb/patches/spurious.patch
>
> You should get some stray IRQ 7's in the vmstat -i output as well as a few
> printf's to the kernel console.

Other changes fixed the problem with the apic case not working on my BP6,
except the apic causes many more interrupts on serial ports at 921600 bps,
almost enough to overload the system with just 2 active serial ports.
I've now gathered lots of statistics for sio interrupt performance.  The
bad effect of the apic on performance is shown in the "-current(apic)"
lines for a45 and a45b only:

%%%
Keywords:
c04 = send at 115200 bps on cuac00, receive at 115200 bps on cuac04
c04b = like c04 plus send and receive in other direction too (b = bidirectional)
  (cuac* are on a Cyclades 8yo (2 * cd1400 isa))
a01 = like c04 except use ports cuaa[01]
a01b = like a01 except bidirectional
  (cuaa[01] are standard motherboard 16550 clones)
a45 = like a01 except use speed 921600 bps and ports cuaa[45]
a45b = like a45 except bidirectional
  (cuaa[45] are on a VScom 200HV2 (2 * 16950 pci))
-current(ointr) = -current before new interrupt code
-current = plain current (2003/11/06)
-current(apic) = -current with apic configured for UP kernel on SMP hardware
-current(bde) = my version of -current (new interrupt code not merged yet)
&+iir,+stream,+intr0 = my version of -current with variants of sio
  optimizations (only UART-independent ones; optimizations for 16950 UARTs
  give factor of 2 reduction in overheads)

Overheads for doing above I/O in percent (min-max for 3 runs) on an ABIT BP6
with 366 MHz and 400 MHz Celerons:

Devices  OS                UP               SMP
-------  --                --               ---
c04      RELENG_4(4.9)     6.58-6.59        Not measured (method problems)
         -current(ointr)   9.65-9.76        6.77-7.11
         -current          10.64-10.69      6.09-6.36
         -current(apic)    9.63-9.90        As above (apic standard)
         -current(bde)     6.83-6.96        3.54-3.78
c04b     RELENG_4(4.9)     12.83-12.90      Not measured (method problems)
         -current(ointr)   19.42-19.44      13.70-13.90
         -current          20.23-20.24      12.01-12.48
         -current(apic)    17.77-17.89      As above (apic standard)
         -current(bde)     12.74-13.23      6.23-6.53
a01      RELENG_4(4.9)     7.50-7.50        Not measured (method problems)
         -current(ointr)   7.67-7.69        4.44-4.77
         -current          8.09-8.13        4.72-5.60
         -current(apic)    7.75-8.02        As above (apic standard)
         -current(bde)     7.53-7.63        4.49-4.54
         &+iir             7.09-7.30        Not measured (kernel problems)
         &+stream          6.23-6.24
         &+iir+stream      5.47-5.52
         &+intr0+iir       5.24-5.26        2.75-2.91
a01b     RELENG_4(4.9)     14.64-14.84      Not measured (method problems)
         -current(ointr)   14.36-15.10      8.65-8.92
         -current          14.79-14.87      8.18-9.77
         -current(apic)    14.80-14.91      As above (apic standard)
         -current(bde)     14.19-14.24      8.13-8.46
         &+iir             14.05-14.13
         &+stream          12.12-12.17
         &+iir+stream      10.58-10.62
         &+intr0+iir       10.07-10.12      5.10-5.63
a45      RELENG_4(4.9)     21.81-21.86      Not measured (method problems)
         -current(ointr)   24.00-24.04      13.3
         -current          25.13-25.20      31.4-31.5(86)
         -current(apic)    51.02-51.05(87)  As above (apic standard)
         -current(bde)     21.83-22.02      10.71-10.89
         &+iir             21.98-22.05
         &+stream          27.78-27.81
         &+iir+stream      22.08-22.16
         &+intr0+iir       16.76-16.92      6.85-8.11
a45b     RELENG_4(4.9)     46.23-46.44(87)  Not measured (method problems)
         -current(ointr)   54.01-54.37(86)  25.2 (82/82)
         -current          56.04-56.93(85)  70.1-70.7(80)
         -current(apic)    87.35-88.22(78)  As above (apic standard)
         -current(bde)     42.06-42.12

RE: New interrupt stuff breaks ASUS 2 CPU system

2003-11-06 Thread John Baldwin

On 06-Nov-2003 Harti Brandt wrote:
> JB>I figured out what is happenning I think.  You are getting a spurious
> JB>interrupt from the 8259A PIC (which comes in on IRQ 7).  The IRR register
> JB>lists pending interrupts still waiting to be serviced.  Try using
> JB>'options NO_MIXED_MODE' to stop using the 8259A's for the clock and see if
> JB>the spurious IRQ 7 interrupts go away.
> 
> Ok, that seems to help. Interesting although why do these interrupts
> happen only with a larger HZ and when the kernel is doing printfs (this
> machine has a serial console). I have also not tried to disable SIO2 and
> the parallel port.

Can you also try turning mixed mode back on and using
http://www.FreeBSD.org/~jhb/patches/spurious.patch

You should get some stray IRQ 7's in the vmstat -i output as well as a few
printf's to the kernel console.

-- 

John Baldwin <[EMAIL PROTECTED]>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/


RE: New interrupt stuff breaks ASUS 2 CPU system

2003-11-06 Thread Harti Brandt
On Wed, 5 Nov 2003, John Baldwin wrote:

JB>
JB>On 05-Nov-2003 Harti Brandt wrote:
JB>> On Tue, 4 Nov 2003, John Baldwin wrote:
JB>>
JB>> JB>
JB>> JB>On 04-Nov-2003 Harti Brandt wrote:
JB>> JB>> On Tue, 4 Nov 2003, Harti Brandt wrote:
JB>> JB>>
JB>> JB>> HB>On Tue, 4 Nov 2003, John Baldwin wrote:
JB>> JB>> HB>
JB>> JB>> HB>JB>
JB>> JB>> HB>JB>On 04-Nov-2003 Harti Brandt wrote:
JB>> JB>> HB>JB>>
JB>> JB>> HB>JB>> Hi,
JB>> JB>> HB>JB>>
JB>> JB>> HB>JB>> I have an ASUS system with 2 CPUs that I need to run at HZ=1. This
JB>> JB>> HB>JB>> worked until yesterday, but with the new interrupt code it doesn't boot
JB>> JB>> HB>JB>> anymore. It works for the standard HZ, but if I set HZ=1000 I get a double
JB>> JB>> HB>JB>> fault. I suspect a race condition in the interrupt handling. My config
JB>> JB>> HB>JB>> file has
JB>> JB>> HB>JB>>
JB>> JB>> HB>JB>> options SMP
JB>> JB>> HB>JB>> device apic
JB>> JB>> HB>JB>> options HZ=1000
JB>> JB>> HB>JB>
JB>> JB>> HB>JB>Ok, I can try to reproduce.
JB>> JB>> HB>JB>
JB>> JB>> HB>JB>> Device configuration finished.
JB>> JB>> HB>JB>> Timecounter "TSC" frequency 1380009492 Hz quality -100
JB>> JB>> HB>JB>> Timecounters cpuid = 0; apic id = 00
JB>> JB>> HB>JB>> instruction pointer   = 0x8:0xc048995d
JB>> JB>> HB>JB>> stack pointer = 0x10:0xc0821bf4
JB>> JB>> HB>JB>> frame pointercpuid = 0; apic id = 00
JB>> JB>> HB>JB>>
JB>> JB>> HB>JB>> 0xc048995d is in critical_exit. It is the jmp after the popf from
JB>> JB>> HB>JB>> cpu_critical_exit.
JB>> JB>> HB>JB>
JB>> JB>> HB>JB>This is where interrupts are re-enabled, so you are getting an interrupt.
JB>> JB>> HB>JB>It might be helpful to figure what type of fault you are actually getting.
JB>> JB>> HB>
JB>> JB>> HB>tf_err is 0, tf_trapno is 30 (decimal).
JB>> JB>>
JB>> JB>> More information:
JB>> JB>>
JB>> JB>> I have replaced all the reserved vectors with individual ones, that set
JB>> JB>> tf_err to the index (vector number). It appears the the vector number is
JB>> JB>> 39 decimal. What does that mean?
JB>> JB>
JB>> JB>IRQ 7.
JB>> JB>Can you post a verbose dmesg?  Also, can you try both with and without
JB>> JB>ACPI?
JB>>
JB>> Attached are both dmesgs.
JB>>
JB>> More datapoints:
JB>>
JB>> I had the parallel port (irq7) and the second sio disabled in the BIOS.
JB>> After enabling both I now get a panic in lapic_handle_intr: Couldn't get
JB>> vector from ISR! After fetching the relevant docs from intel I checked the
JB>> registers of the apic pointed to by lapic. The interrupt taken is
JB>> Xapic_irq1. isr1 is zero, but irr1 is 0x100 (that was without ACPI). How
JB>> may that happen? As I understand ISR are the interrupts that have been
JB>> delivered to the CPU so if it is interrupted a bit should be set, correct?
JB>
JB>I figured out what is happenning I think.  You are getting a spurious
JB>interrupt from the 8259A PIC (which comes in on IRQ 7).  The IRR register
JB>lists pending interrupts still waiting to be serviced.  Try using
JB>'options NO_MIXED_MODE' to stop using the 8259A's for the clock and see if
JB>the spurious IRQ 7 interrupts go away.

Ok, that seems to help. It is interesting, though: why do these interrupts
happen only with a larger HZ and when the kernel is doing printfs (this
machine has a serial console)? I have also not tried to disable SIO2 and
the parallel port.

Thanks,
harti
-- 
harti brandt,
http://www.fokus.fraunhofer.de/research/cc/cats/employees/hartmut.brandt/private
[EMAIL PROTECTED], [EMAIL PROTECTED]


RE: New interrupt stuff breaks ASUS 2 CPU system

2003-11-05 Thread John Baldwin

On 05-Nov-2003 Harti Brandt wrote:
> On Tue, 4 Nov 2003, John Baldwin wrote:
> 
> JB>
> JB>On 04-Nov-2003 Harti Brandt wrote:
> JB>> On Tue, 4 Nov 2003, Harti Brandt wrote:
> JB>>
> JB>> HB>On Tue, 4 Nov 2003, John Baldwin wrote:
> JB>> HB>
> JB>> HB>JB>
> JB>> HB>JB>On 04-Nov-2003 Harti Brandt wrote:
> JB>> HB>JB>>
> JB>> HB>JB>> Hi,
> JB>> HB>JB>>
> JB>> HB>JB>> I have an ASUS system with 2 CPUs that I need to run at HZ=1. This
> JB>> HB>JB>> worked until yesterday, but with the new interrupt code it doesn't boot
> JB>> HB>JB>> anymore. It works for the standard HZ, but if I set HZ=1000 I get a double
> JB>> HB>JB>> fault. I suspect a race condition in the interrupt handling. My config
> JB>> HB>JB>> file has
> JB>> HB>JB>>
> JB>> HB>JB>> options SMP
> JB>> HB>JB>> device apic
> JB>> HB>JB>> options HZ=1000
> JB>> HB>JB>
> JB>> HB>JB>Ok, I can try to reproduce.
> JB>> HB>JB>
> JB>> HB>JB>> Device configuration finished.
> JB>> HB>JB>> Timecounter "TSC" frequency 1380009492 Hz quality -100
> JB>> HB>JB>> Timecounters cpuid = 0; apic id = 00
> JB>> HB>JB>> instruction pointer   = 0x8:0xc048995d
> JB>> HB>JB>> stack pointer = 0x10:0xc0821bf4
> JB>> HB>JB>> frame pointercpuid = 0; apic id = 00
> JB>> HB>JB>>
> JB>> HB>JB>> 0xc048995d is in critical_exit. It is the jmp after the popf from
> JB>> HB>JB>> cpu_critical_exit.
> JB>> HB>JB>
> JB>> HB>JB>This is where interrupts are re-enabled, so you are getting an interrupt.
> JB>> HB>JB>It might be helpful to figure what type of fault you are actually getting.
> JB>> HB>
> JB>> HB>tf_err is 0, tf_trapno is 30 (decimal).
> JB>>
> JB>> More information:
> JB>>
> JB>> I have replaced all the reserved vectors with individual ones, that set
> JB>> tf_err to the index (vector number). It appears the the vector number is
> JB>> 39 decimal. What does that mean?
> JB>
> JB>IRQ 7.
> JB>Can you post a verbose dmesg?  Also, can you try both with and without
> JB>ACPI?
> 
> Attached are both dmesgs.
> 
> More datapoints:
> 
> I had the parallel port (irq7) and the second sio disabled in the BIOS.
> After enabling both I now get a panic in lapic_handle_intr: Couldn't get
> vector from ISR! After fetching the relevant docs from intel I checked the
> registers of the apic pointed to by lapic. The interrupt taken is
> Xapic_irq1. isr1 is zero, but irr1 is 0x100 (that was without ACPI). How
> may that happen? As I understand ISR are the interrupts that have been
> delivered to the CPU so if it is interrupted a bit should be set, correct?

I think I figured out what is happening.  You are getting a spurious
interrupt from the 8259A PIC (which comes in on IRQ 7).  The IRR register
lists pending interrupts still waiting to be serviced.  Try using
'options NO_MIXED_MODE' to stop using the 8259As for the clock and see if
the spurious IRQ 7 interrupts go away.
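
For reference, a minimal sketch of how a spurious IRQ 7 from the master
8259A is usually told apart from a real one: the PIC delivers the IRQ 7
vector but does not set bit 7 in its in-service register.  The function
names below are hypothetical and not taken from the FreeBSD atpic driver.
The caller is assumed to have read the in-service register already (by
writing OCW3 = 0x0b to port 0x20 and reading port 0x20 back), so the
port I/O is left out and the check itself stays a pure function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PIC_IRQS_PER_CHIP	8	/* one 8259A handles 8 IRQ lines */
#define PIC_SPURIOUS_IRQ	7	/* spurious interrupts show up as IRQ 7 */

/*
 * Hypothetical helper: report whether 'irq' is marked in service in a
 * snapshot of the 8259A in-service register.
 */
static bool
pic_irq_in_service(uint8_t pic_isr, unsigned irq)
{
	assert(irq < PIC_IRQS_PER_CHIP);
	return ((pic_isr & (1u << irq)) != 0);
}

/*
 * Hypothetical helper: an IRQ 7 vector that arrives while bit 7 of the
 * in-service register is clear is treated as spurious.
 */
static bool
pic_irq7_is_spurious(uint8_t pic_isr)
{
	return (!pic_irq_in_service(pic_isr, PIC_SPURIOUS_IRQ));
}
```

A handler built on this would return without sending an EOI when
pic_irq7_is_spurious() is true, since the PIC never marked the interrupt
as in service.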

> A question while reading the code: what does the global lapic variable
> refer to? As I understand every CPU has its local APIC. Does it point to
> one of those two? To which?

Every CPU reaches its own local APIC at the same physical address.  Thus,
CPU A can only get to its own local APIC, and not to any other CPU's.  The
'lapic' variable holds a virtual address mapped to the physical address of
the local APIC.
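
As an illustration of that layout, here is a small sketch that computes
the physical addresses of the ISR and IRR register words.  It assumes
the architectural default local APIC base of 0xFEE00000 and the register
offsets from the Intel documentation (ISR at 0x100, IRR at 0x200, one
32-bit word every 16 bytes).  The macro and function names are
hypothetical, and a real kernel accesses these registers through a
mapped virtual address rather than the raw physical one.

```c
#include <assert.h>
#include <stdint.h>

#define LAPIC_DEFAULT_PADDR	0xFEE00000u	/* architectural default base */
#define LAPIC_ISR_OFFSET	0x100u		/* in-service register bank */
#define LAPIC_IRR_OFFSET	0x200u		/* interrupt request register bank */
#define LAPIC_REG_STRIDE	0x10u		/* registers sit 16 bytes apart */
#define LAPIC_REG_WORDS		8u		/* 8 x 32 bits = 256 vectors */

/*
 * Hypothetical helper: physical address of word 'word' (0-7) in the
 * register bank that starts at 'bank_offset'.  Every CPU uses the same
 * address and sees only its own local APIC behind it.
 */
static uint32_t
lapic_reg_paddr(uint32_t bank_offset, unsigned word)
{
	assert(word < LAPIC_REG_WORDS);
	return (LAPIC_DEFAULT_PADDR + bank_offset + word * LAPIC_REG_STRIDE);
}
```

With these assumptions, the isr1 and irr1 words mentioned above are
lapic_reg_paddr(LAPIC_ISR_OFFSET, 1) and
lapic_reg_paddr(LAPIC_IRR_OFFSET, 1).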

-- 

John Baldwin <[EMAIL PROTECTED]>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/


Re: New interrupt stuff breaks ASUS 2 CPU system

2003-11-05 Thread Barney Wolff
Another data point:  I can't get my Asus A7M266-D to boot with the
new interrupt code at all, perhaps because I have an Adaptec 39160.
Whether acpi is on or off, whether it's in the kernel config or not,
booting always hangs right after "waiting 10 sec for scsi to settle"
and "0 scb's aborted".  I've also tried it with 0,1 or 2 of the ide
controllers enabled, with no change in result.  Sometimes I get a
"spurious interrupt" from ata1 message, sometimes not.  Kernel from
10/27 works fine.  Kernels from last couple of days fail.

dmesg, config available on request, if wanted.

-- 
Barney Wolff http://www.databus.com/bwresume.pdf
I'm available by contract or FT, in the NYC metro area or via the 'Net.


RE: New interrupt stuff breaks ASUS 2 CPU system

2003-11-05 Thread Harti Brandt
On Wed, 5 Nov 2003, Harti Brandt wrote:

HB>On Tue, 4 Nov 2003, John Baldwin wrote:
HB>
HB>JB>
HB>JB>On 04-Nov-2003 Harti Brandt wrote:
HB>JB>> On Tue, 4 Nov 2003, Harti Brandt wrote:
HB>JB>>
HB>JB>> HB>On Tue, 4 Nov 2003, John Baldwin wrote:
HB>JB>> HB>
HB>JB>> HB>JB>
HB>JB>> HB>JB>On 04-Nov-2003 Harti Brandt wrote:
HB>JB>> HB>JB>>
HB>JB>> HB>JB>> Hi,
HB>JB>> HB>JB>>
HB>JB>> HB>JB>> I have an ASUS system with 2 CPUs that I need to run at HZ=1. This
HB>JB>> HB>JB>> worked until yesterday, but with the new interrupt code it doesn't boot
HB>JB>> HB>JB>> anymore. It works for the standard HZ, but if I set HZ=1000 I get a double
HB>JB>> HB>JB>> fault. I suspect a race condition in the interrupt handling. My config
HB>JB>> HB>JB>> file has
HB>JB>> HB>JB>>
HB>JB>> HB>JB>> options SMP
HB>JB>> HB>JB>> device apic
HB>JB>> HB>JB>> options HZ=1000
HB>JB>> HB>JB>
HB>JB>> HB>JB>Ok, I can try to reproduce.
HB>JB>> HB>JB>
HB>JB>> HB>JB>> Device configuration finished.
HB>JB>> HB>JB>> Timecounter "TSC" frequency 1380009492 Hz quality -100
HB>JB>> HB>JB>> Timecounters cpuid = 0; apic id = 00
HB>JB>> HB>JB>> instruction pointer   = 0x8:0xc048995d
HB>JB>> HB>JB>> stack pointer = 0x10:0xc0821bf4
HB>JB>> HB>JB>> frame pointercpuid = 0; apic id = 00
HB>JB>> HB>JB>>
HB>JB>> HB>JB>> 0xc048995d is in critical_exit. It is the jmp after the popf from
HB>JB>> HB>JB>> cpu_critical_exit.
HB>JB>> HB>JB>
HB>JB>> HB>JB>This is where interrupts are re-enabled, so you are getting an interrupt.
HB>JB>> HB>JB>It might be helpful to figure what type of fault you are actually getting.
HB>JB>> HB>
HB>JB>> HB>tf_err is 0, tf_trapno is 30 (decimal).
HB>JB>>
HB>JB>> More information:
HB>JB>>
HB>JB>> I have replaced all the reserved vectors with individual ones, that set
HB>JB>> tf_err to the index (vector number). It appears that the vector number is
HB>JB>> 39 decimal. What does that mean?
HB>JB>
HB>JB>IRQ 7.
HB>JB>Can you post a verbose dmesg?  Also, can you try both with and without
HB>JB>ACPI?
HB>
HB>Attached are both dmesgs.
HB>
HB>More datapoints:
HB>
HB>I had the parallel port (irq7) and the second sio disabled in the BIOS.
HB>After enabling both I now get a panic in lapic_handle_intr: Couldn't get
HB>vector from ISR! After fetching the relevant docs from intel I checked the
HB>registers of the apic pointed to by lapic. The interrupt taken is
HB>Xapic_irq1. isr1 is zero, but irr1 is 0x100 (that was without ACPI). How
HB>may that happen? As I understand ISR are the interrupts that have been
HB>delivered to the CPU so if it is interrupted a bit should be set, correct?
HB>
HB>I then replaced the panic with a printf() followed by a return. Now the
HB>system comes to life, but I get a couple of these warnings. When the
HB>system is idle everything seems fine, but when I start my simulation
HB>application (which normally generates between 20k and 250k interrupts/sec
HB>depending on the MPSAFE setting of the ATM drivers) I get approx 1-2 of
HB>these messages per second (this is with HZ=1000).
HB>
HB>A question while reading the code: what does the global lapic variable
HB>refer to? As I understand every CPU has its local APIC. Does it point to
HB>one of those two? To which?

An additional point. In the above test where I got 1-2 messages per second
I have now disabled a debugging printout in the ATM driver that gave 3-4
messages per second (from the interrupt handler). Now the 'Couldn't
get...' messages have disappeared. So this really looks like a race
somewhere. Is it possible that the bit in the ISR gets somehow cleared
between the point where the interrupt is handed to the processor but
before the Xapic_irq1 really runs and sees that bit? Perhaps from another
Xapic_irq1 instance or whatever?

harti
-- 
harti brandt,
http://www.fokus.fraunhofer.de/research/cc/cats/employees/hartmut.brandt/private
[EMAIL PROTECTED], [EMAIL PROTECTED]


RE: New interrupt stuff breaks ASUS 2 CPU system

2003-11-04 Thread John Baldwin

On 04-Nov-2003 Harti Brandt wrote:
> On Tue, 4 Nov 2003, Harti Brandt wrote:
> 
> HB>On Tue, 4 Nov 2003, John Baldwin wrote:
> HB>
> HB>JB>
> HB>JB>On 04-Nov-2003 Harti Brandt wrote:
> HB>JB>>
> HB>JB>> Hi,
> HB>JB>>
> HB>JB>> I have an ASUS system with 2 CPUs that I need to run at HZ=1. This
> HB>JB>> worked until yesterday, but with the new interrupt code it doesn't boot
> HB>JB>> anymore. It works for the standard HZ, but if I set HZ=1000 I get a double
> HB>JB>> fault. I suspect a race condition in the interrupt handling. My config
> HB>JB>> file has
> HB>JB>>
> HB>JB>> options SMP
> HB>JB>> device apic
> HB>JB>> options HZ=1000
> HB>JB>
> HB>JB>Ok, I can try to reproduce.
> HB>JB>
> HB>JB>> Device configuration finished.
> HB>JB>> Timecounter "TSC" frequency 1380009492 Hz quality -100
> HB>JB>> Timecounters cpuid = 0; apic id = 00
> HB>JB>> instruction pointer   = 0x8:0xc048995d
> HB>JB>> stack pointer = 0x10:0xc0821bf4
> HB>JB>> frame pointercpuid = 0; apic id = 00
> HB>JB>>
> HB>JB>> 0xc048995d is in critical_exit. It is the jmp after the popf from
> HB>JB>> cpu_critical_exit.
> HB>JB>
> HB>JB>This is where interrupts are re-enabled, so you are getting an interrupt.
> HB>JB>It might be helpful to figure what type of fault you are actually getting.
> HB>
> HB>tf_err is 0, tf_trapno is 30 (decimal).
> 
> More information:
> 
> I have replaced all the reserved vectors with individual ones, that set
> tf_err to the index (vector number). It appears that the vector number is
> 39 decimal. What does that mean?

IRQ 7.
Can you post a verbose dmesg?  Also, can you try both with and without
ACPI?
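
For reference, a minimal sketch of the arithmetic behind that answer.
It assumes the conventional PC layout in which the 8259A PICs deliver
IRQs 0-15 on IDT vectors 32-47, so vector 39 minus the base of 32 gives
IRQ 7.  The macro and function names are hypothetical and only
illustrate the mapping.

```c
#include <assert.h>

/* Assumption: the 8259A pair is programmed with a vector base of 32. */
#define PIC_VECTOR_BASE	32
#define PIC_NUM_IRQS	16

/*
 * Hypothetical helper: translate an IDT vector number into the legacy
 * IRQ it belongs to, or return -1 if the vector lies outside the range
 * the PICs use.
 */
static int
vector_to_pic_irq(int vector)
{
	assert(vector >= 0 && vector <= 255);
	if (vector < PIC_VECTOR_BASE ||
	    vector >= PIC_VECTOR_BASE + PIC_NUM_IRQS)
		return (-1);
	return (vector - PIC_VECTOR_BASE);
}
```

Under that assumption vector_to_pic_irq(39) yields 7, while the trap
number 30 reported earlier is below the base and maps to no IRQ at all.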

-- 

John Baldwin <[EMAIL PROTECTED]>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/


Re: New interrupt stuff breaks ASUS 2 CPU system

2003-11-04 Thread Claus Guttesen
Hi.

> > I also have an ASUS motherboard with an Intel 875P
> > chipset.
> 
> Can you post a dmesg?  Note that if you want
> hyperthreading,
> you need to enable it in your BIOS.  The ACPI (and
> soon the
> MPTable) drivers will not use HT CPUs unless HT is
> enabled in
> the BIOS.  My test machines with HT used the 865
> chipset.

Upon boot the screen says that it's a dual Xeon with
HT.

I "downgraded" the server before I read this thread,
so it's running the previous day's src. I guess that a
dmesg from that won't help.

regards
Claus




RE: New interrupt stuff breaks ASUS 2 CPU system

2003-11-04 Thread John Baldwin

On 04-Nov-2003 Harti Brandt wrote:
> On Tue, 4 Nov 2003, Harti Brandt wrote:
> 
> HB>On Tue, 4 Nov 2003, John Baldwin wrote:
> HB>
> HB>JB>
> HB>JB>On 04-Nov-2003 Harti Brandt wrote:
> HB>JB>>
> HB>JB>> Hi,
> HB>JB>>
> HB>JB>> I have an ASUS system with 2 CPUs that I need to run at HZ=1. This
> HB>JB>> worked until yesterday, but with the new interrupt code it doesn't boot
> HB>JB>> anymore. It works for the standard HZ, but if I set HZ=1000 I get a double
> HB>JB>> fault. I suspect a race condition in the interrupt handling. My config
> HB>JB>> file has
> HB>JB>>
> HB>JB>> options SMP
> HB>JB>> device apic
> HB>JB>> options HZ=1000
> HB>JB>
> HB>JB>Ok, I can try to reproduce.
> HB>JB>
> HB>JB>> Device configuration finished.
> HB>JB>> Timecounter "TSC" frequency 1380009492 Hz quality -100
> HB>JB>> Timecounters cpuid = 0; apic id = 00
> HB>JB>> instruction pointer   = 0x8:0xc048995d
> HB>JB>> stack pointer = 0x10:0xc0821bf4
> HB>JB>> frame pointercpuid = 0; apic id = 00
> HB>JB>>
> HB>JB>> 0xc048995d is in critical_exit. It is the jmp after the popf from
> HB>JB>> cpu_critical_exit.
> HB>JB>
> HB>JB>This is where interrupts are re-enabled, so you are getting an interrupt.
> HB>JB>It might be helpful to figure what type of fault you are actually getting.
> HB>
> HB>tf_err is 0, tf_trapno is 30 (decimal).
> 
> Hmm, this seems to be the trapno that is set for all otherwise unused
> vectors, correct? There seems to be no info in the trapframe that shows
> me where this trap came from. How can I find this out?

You can't easily.  If you have an APIC, you can try looking at the
ISR registers.  You need to add some code to local_apic.c that dumps
the ISR contents and then call that from trap() perhaps.
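
A rough sketch of what such a dump could look like, written as plain C
that works on a snapshot of the registers rather than as real
local_apic.c code.  It assumes the layout from the Intel documentation:
the ISR (and likewise the IRR) is 256 bits wide, split over eight 32-bit
words, with bit n of word k standing for vector 32*k + n.  The function
names are hypothetical, and copying the eight words out of the local
APIC is left to the caller.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define APIC_REG_WORDS	8	/* 8 x 32 bits cover vectors 0-255 */

/*
 * Hypothetical helper: return the highest vector whose bit is set in
 * an ISR or IRR snapshot, or -1 if no bit is set.
 */
static int
apic_highest_vector(const uint32_t regs[APIC_REG_WORDS])
{
	assert(regs != NULL);
	for (int word = APIC_REG_WORDS - 1; word >= 0; word--) {
		for (int bit = 31; bit >= 0; bit--) {
			if (regs[word] & ((uint32_t)1 << bit))
				return (word * 32 + bit);
		}
	}
	return (-1);
}

/*
 * Hypothetical helper: print every vector set in the snapshot and
 * return how many were found.
 */
static int
apic_dump_vectors(const char *name, const uint32_t regs[APIC_REG_WORDS])
{
	int count = 0;

	assert(name != NULL && regs != NULL);
	for (int word = 0; word < APIC_REG_WORDS; word++) {
		for (int bit = 0; bit < 32; bit++) {
			if (regs[word] & ((uint32_t)1 << bit)) {
				printf("%s: vector %d\n", name,
				    word * 32 + bit);
				count++;
			}
		}
	}
	return (count);
}
```

Applied to the values reported elsewhere in this thread (isr1 zero,
irr1 0x100), the ISR snapshot decodes to no vector at all and the IRR
snapshot to vector 40, assuming isr1 and irr1 are the second word of
each bank.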

-- 

John Baldwin <[EMAIL PROTECTED]>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/


RE: New interrupt stuff breaks ASUS 2 CPU system

2003-11-04 Thread Harti Brandt
On Tue, 4 Nov 2003, Harti Brandt wrote:

HB>On Tue, 4 Nov 2003, John Baldwin wrote:
HB>
HB>JB>
HB>JB>On 04-Nov-2003 Harti Brandt wrote:
HB>JB>>
HB>JB>> Hi,
HB>JB>>
HB>JB>> I have an ASUS system with 2 CPUs that I need to run at HZ=1. This
HB>JB>> worked until yesterday, but with the new interrupt code it doesn't boot
HB>JB>> anymore. It works for the standard HZ, but if I set HZ=1000 I get a double
HB>JB>> fault. I suspect a race condition in the interrupt handling. My config
HB>JB>> file has
HB>JB>>
HB>JB>> options SMP
HB>JB>> device apic
HB>JB>> options HZ=1000
HB>JB>
HB>JB>Ok, I can try to reproduce.
HB>JB>
HB>JB>> Device configuration finished.
HB>JB>> Timecounter "TSC" frequency 1380009492 Hz quality -100
HB>JB>> Timecounters cpuid = 0; apic id = 00
HB>JB>> instruction pointer   = 0x8:0xc048995d
HB>JB>> stack pointer = 0x10:0xc0821bf4
HB>JB>> frame pointercpuid = 0; apic id = 00
HB>JB>>
HB>JB>> 0xc048995d is in critical_exit. It is the jmp after the popf from
HB>JB>> cpu_critical_exit.
HB>JB>
HB>JB>This is where interrupts are re-enabled, so you are getting an interrupt.
HB>JB>It might be helpful to figure what type of fault you are actually getting.
HB>
HB>tf_err is 0, tf_trapno is 30 (decimal).

More information:

I have replaced all the reserved vectors with individual ones that set
tf_err to the index (vector number). It appears that the vector number is
39 decimal. What does that mean?

harti
-- 
harti brandt,
http://www.fokus.fraunhofer.de/research/cc/cats/employees/hartmut.brandt/private
[EMAIL PROTECTED], [EMAIL PROTECTED]


RE: New interrupt stuff breaks ASUS 2 CPU system

2003-11-04 Thread Harti Brandt
On Tue, 4 Nov 2003, Harti Brandt wrote:

HB>On Tue, 4 Nov 2003, John Baldwin wrote:
HB>
HB>JB>
HB>JB>On 04-Nov-2003 Harti Brandt wrote:
HB>JB>>
HB>JB>> Hi,
HB>JB>>
HB>JB>> I have an ASUS system with 2 CPUs that I need to run at HZ=1. This
HB>JB>> worked until yesterday, but with the new interrupt code it doesn't boot
HB>JB>> anymore. It works for the standard HZ, but if I set HZ=1000 I get a double
HB>JB>> fault. I suspect a race condition in the interrupt handling. My config
HB>JB>> file has
HB>JB>>
HB>JB>> options SMP
HB>JB>> device apic
HB>JB>> options HZ=1000
HB>JB>
HB>JB>Ok, I can try to reproduce.
HB>JB>
HB>JB>> Device configuration finished.
HB>JB>> Timecounter "TSC" frequency 1380009492 Hz quality -100
HB>JB>> Timecounters cpuid = 0; apic id = 00
HB>JB>> instruction pointer   = 0x8:0xc048995d
HB>JB>> stack pointer = 0x10:0xc0821bf4
HB>JB>> frame pointercpuid = 0; apic id = 00
HB>JB>>
HB>JB>> 0xc048995d is in critical_exit. It is the jmp after the popf from
HB>JB>> cpu_critical_exit.
HB>JB>
HB>JB>This is where interrupts are re-enabled, so you are getting an interrupt.
HB>JB>It might be helpful to figure what type of fault you are actually getting.
HB>
HB>tf_err is 0, tf_trapno is 30 (decimal).

Hmm, this seems to be the trapno that is set for all otherwise unused
vectors, correct? There seems to be no info in the trapframe that shows
me where this trap came from. How can I find this out?

harti
-- 
harti brandt,
http://www.fokus.fraunhofer.de/research/cc/cats/employees/hartmut.brandt/private
[EMAIL PROTECTED], [EMAIL PROTECTED]


RE: New interrupt stuff breaks ASUS 2 CPU system

2003-11-04 Thread Harti Brandt
On Tue, 4 Nov 2003, John Baldwin wrote:

JB>
JB>On 04-Nov-2003 Harti Brandt wrote:
JB>>
JB>> Hi,
JB>>
JB>> I have an ASUS system with 2 CPUs that I need to run at HZ=1. This
JB>> worked until yesterday, but with the new interrupt code it doesn't boot
JB>> anymore. It works for the standard HZ, but if I set HZ=1000 I get a double
JB>> fault. I suspect a race condition in the interrupt handling. My config
JB>> file has
JB>>
JB>> options SMP
JB>> device apic
JB>> options HZ=1000
JB>
JB>Ok, I can try to reproduce.
JB>
JB>> Device configuration finished.
JB>> Timecounter "TSC" frequency 1380009492 Hz quality -100
JB>> Timecounters cpuid = 0; apic id = 00
JB>> instruction pointer   = 0x8:0xc048995d
JB>> stack pointer = 0x10:0xc0821bf4
JB>> frame pointercpuid = 0; apic id = 00
JB>>
JB>> 0xc048995d is in critical_exit. It is the jmp after the popf from
JB>> cpu_critical_exit.
JB>
JB>This is where interrupts are re-enabled, so you are getting an interrupt.
JB>It might be helpful to figure what type of fault you are actually getting.

tf_err is 0, tf_trapno is 30 (decimal).

harti
-- 
harti brandt,
http://www.fokus.fraunhofer.de/research/cc/cats/employees/hartmut.brandt/private
[EMAIL PROTECTED], [EMAIL PROTECTED]


RE: New interrupt stuff breaks ASUS 2 CPU system

2003-11-04 Thread John Baldwin

On 04-Nov-2003 Harti Brandt wrote:
> 
> Hi,
> 
> I have an ASUS system with 2 CPUs that I need to run at HZ=1. This
> worked until yesterday, but with the new interrupt code it doesn't boot
> anymore. It works for the standard HZ, but if I set HZ=1000 I get a double
> fault. I suspect a race condition in the interrupt handling. My config
> file has
> 
> options SMP
> device apic
> options HZ=1000

Ok, I can try to reproduce.

> Device configuration finished.
> Timecounter "TSC" frequency 1380009492 Hz quality -100
> Timecounters cpuid = 0; apic id = 00
> instruction pointer   = 0x8:0xc048995d
> stack pointer = 0x10:0xc0821bf4
> frame pointercpuid = 0; apic id = 00
> 
> 0xc048995d is in critical_exit. It is the jmp after the popf from
> cpu_critical_exit.

This is where interrupts are re-enabled, so you are getting an interrupt.
It might be helpful to figure what type of fault you are actually getting.

-- 

John Baldwin <[EMAIL PROTECTED]>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/


Re: New interrupt stuff breaks ASUS 2 CPU system

2003-11-04 Thread John Baldwin

On 04-Nov-2003 Claus Guttesen wrote:
> Hi.
> 
>> I have an ASUS system with 2 CPUs that I need to run
>> at HZ=1. This
>> worked until yesterday, but with the new interrupt
>> code it doesn't boot
>> anymore. It works for the standard HZ, but if I set
>> HZ=1000 I get a double
> 
> Compiled a new kernel with source from Nov. 3'rd where
> SMP and APIC had to be enabled to use SMP. A make
> kernel would complete in 10 min's.
> 
> So I cvsupped to test the 'interrupt stuff' and
> recompiled. Upon boot it seemed that it only saw one
> of my two Xeons at 2.4 Ghz. Hypert. was enabled as
> default. So I reverted to the source the day before.
> 
> I also have an ASUS motherboard with an Intel 875P
> chipset.

Can you post a dmesg?  Note that if you want hyperthreading,
you need to enable it in your BIOS.  The ACPI (and soon the
MPTable) drivers will not use HT CPUs unless HT is enabled in
the BIOS.  My test machines with HT used the 865 chipset.

-- 

John Baldwin <[EMAIL PROTECTED]>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/


Re: New interrupt stuff breaks ASUS 2 CPU system

2003-11-04 Thread Claus Guttesen
Hi.

> I have an ASUS system with 2 CPUs that I need to run
> at HZ=1. This
> worked until yesterday, but with the new interrupt
> code it doesn't boot
> anymore. It works for the standard HZ, but if I set
> HZ=1000 I get a double

Compiled a new kernel with source from Nov. 3rd where
SMP and APIC had to be enabled to use SMP. A make
kernel would complete in 10 minutes.

So I cvsupped to test the 'interrupt stuff' and
recompiled. Upon boot it seemed that it only saw one
of my two Xeons at 2.4 GHz. Hyperthreading was enabled
by default. So I reverted to the source from the day before.

I also have an ASUS motherboard with an Intel 875P
chipset.

regards
Claus





New interrupt stuff breaks ASUS 2 CPU system

2003-11-04 Thread Harti Brandt

Hi,

I have an ASUS system with 2 CPUs that I need to run at HZ=1. This
worked until yesterday, but with the new interrupt code it doesn't boot
anymore. It works for the standard HZ, but if I set HZ=1000 I get a double
fault. I suspect a race condition in the interrupt handling. My config
file has

options SMP
device apic
options HZ=1000

I have commented out acpi, but that doesn't change anything. A verbose
boot looks like:

OK boot -v
SMAP type=01 base= len=0009f800
SMAP type=02 base=0009f800 len=0800
SMAP type=02 base=000f len=0001
SMAP type=01 base=0010 len=1feec000
SMAP type=03 base=1ffec000 len=3000
SMAP type=02 base=1ffef000 len=0001
SMAP type=04 base=1000 len=1000
SMAP type=02 base=fec0 len=1000
SMAP type=02 base=fee0 len=1000
SMAP type=02 base= len=0001
Copyright (c) 1992-2003 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 5.1-CURRENT #29: Tue Nov  4 13:50:23 CET 2003
[EMAIL PROTECTED]:/opt/obj/usr/src/sys/MARIPOSA
Preloaded elf kernel "/boot/kernel/kernel" at 0xc069b000.
Preloaded elf module "/boot/kernel/random.ko" at 0xc069b278.
Calibrating clock(s) ... i8254 clock: 1193132 Hz
CLK_USE_I8254_CALIBRATION not specified - using default frequency
Timecounter "i8254" frequency 1193182 Hz quality 0
Calibrating TSC clock ... TSC clock: 1380009492 Hz
CPU: AMD Athlon(TM) MP 1800+ (1380.01-MHz 686-class CPU)
  Origin = "AuthenticAMD"  Id = 0x662  Stepping = 2
  
Features=0x383fbff
  AMD Features=0xc048
Data TLB: 32 entries, fully associative
Instruction TLB: 16 entries, fully associative
L1 data cache: 64 kbytes, 64 bytes/line, 1 lines/tag, 2-way associative
L1 instruction cache: 64 kbytes, 64 bytes/line, 1 lines/tag, 2-way associative
L2 internal cache: 256 kbytes, 64 bytes/line, 1 lines/tag, 8-way associative
real memory  = 536788992 (511 MB)
Physical memory chunk(s):
0x1000 - 0x0009efff, 647168 bytes (158 pages)
0x0010 - 0x003f, 3145728 bytes (768 pages)
0x00829000 - 0x1f6c5fff, 518639616 bytes (126621 pages)
avail memory = 516005888 (492 MB)
MPTable: 
APIC ID: physical 0, logical 0:0
APIC ID: physical 1, logical 0:1
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
bios32: Found BIOS32 Service Directory header at 0xc00f2560
bios32: Entry = 0xf1d20 (c00f1d20)  Rev = 0  Len = 1
pcibios: PCI BIOS entry at 0xf+0x1f20
pnpbios: Found PnP BIOS data at 0xc00fc5e0
pnpbios: Entry = f:c610  Rev = 1.0
pnpbios: OEM ID cd041
Other BIOS signatures found:
ioapic0: Assuming intbase of 0
ioapic0: intpin 0 -> ExtINT
ioapic0: intpin 1 -> irq 1
ioapic0: intpin 2 -> irq 2
ioapic0: intpin 3 -> irq 3
ioapic0: intpin 4 -> irq 4
ioapic0: intpin 5 -> irq 5
ioapic0: intpin 6 -> irq 6
ioapic0: intpin 7 -> irq 7
ioapic0: intpin 8 -> irq 8
ioapic0: intpin 9 -> irq 9
ioapic0: intpin 10 -> irq 10
ioapic0: intpin 11 -> irq 11
ioapic0: intpin 12 -> irq 12
ioapic0: intpin 13 -> irq 13
ioapic0: intpin 14 -> irq 14
ioapic0: intpin 15 -> irq 15
ioapic0: intpin 16 -> irq 16
ioapic0: intpin 17 -> irq 17
ioapic0: intpin 18 -> irq 18
ioapic0: intpin 19 -> irq 19
ioapic0: intpin 20 -> irq 20
ioapic0: intpin 21 -> irq 21
ioapic0: intpin 22 -> irq 22
ioapic0: intpin 23 -> irq 23
ioapic0: intpin 16 trigger: level
ioapic0: intpin 16 polarity: active-lo
ioapic0: intpin 16 trigger: level
ioapic0: intpin 16 polarity: active-lo
ioapic0: intpin 19 trigger: level
ioapic0: intpin 19 polarity: active-lo
ioapic0: intpin 18 trigger: level
ioapic0: intpin 18 polarity: active-lo
ioapic0: intpin 17 trigger: level
ioapic0: intpin 17 polarity: active-lo
ioapic0: intpin 19 trigger: level
ioapic0: intpin 19 polarity: active-lo
ioapic0: intpin 1 trigger: edge
ioapic0: intpin 1 polarity: active-hi
ioapic0: Routing IRQ 0 -> intpin 2
ioapic0: intpin 2 trigger: edge
ioapic0: intpin 2 polarity: active-hi
ioapic0: intpin 4 trigger: edge
ioapic0: intpin 4 polarity: active-hi
ioapic0: intpin 5 trigger: edge
ioapic0: intpin 5 polarity: active-hi
ioapic0: intpin 6 trigger: edge
ioapic0: intpin 6 polarity: active-hi
ioapic0: intpin 7 trigger: edge
ioapic0: intpin 7 polarity: active-hi
ioapic0: intpin 8 trigger: edge
ioapic0: intpin 8 polarity: active-hi
ioapic0: intpin 9 trigger: edge
ioapic0: intpin 9 polarity: active-hi
ioapic0: intpin 13 trigger: edge
ioapic0: intpin 13 polarity: active-hi
ioapic0: intpin 14 trigger: edge
ioapic0: intpin 14 polarity: active-hi
ioapic0: intpin 15 trigger: edge
ioapic0: intpin 15 polarity: active-hi
lapic: Routing ExtINT -> LINT0
lapic: LINT0 trigger: edge
lapic: LINT0 polarity: active-hi
lapic: Routing NMI -> LINT1
lapic: LINT1 trigger: edge
lapic: LINT1 polarit