James,
I have a similar system, with two 366 CPUs, large heatsink/fans on the
CPUs, a fan on the BX chip, and 256 MB of ECC memory (single module).
This system will not run overnight at 550 MHz under Linux or NT, and I
could not build a kernel without seeing the "Received signal 11,
exiting" errors. This system was built by a vendor who claimed that
they had tested it and guaranteed it would run at 550 MHz.
With the FSB set to 92 MHz, the system is very stable, and I have had it
up for days with 2 copies of SETI running and my MP3 player always on
(using a SB Live PCI card). I am also running KDE with the X screen
saver turned on. The graphics card is an ELSA Erazor X (NVIDIA GeForce
256), and I am running Mandrake 7.0 with the 2.2.14 kernel, which I
patched for the Ultra/66 chipset.
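
In case the arithmetic helps, here is a rough sketch of how the FSB
setting maps to core clock. It assumes the Celeron 366's locked 5.5x
multiplier (66 MHz x 5.5 at stock); the function name is just for
illustration.

```python
# Assumption: the Celeron 366 is multiplier-locked at 5.5x,
# so the core clock is simply FSB x 5.5.
MULTIPLIER = 5.5


def core_clock_mhz(fsb_mhz, multiplier=MULTIPLIER):
    """Return the core clock in MHz for a given front-side bus speed."""
    return fsb_mhz * multiplier


# Stock FSB, the setting I am running, and the setting that gives 550.
for fsb in (66, 92, 100):
    print(f"FSB {fsb:>3} MHz -> core {core_clock_mhz(fsb):.0f} MHz")
```

By that arithmetic a 92 MHz FSB works out to about 506 MHz, roughly
8% below the 550 you get at 100 MHz.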
My CPU temps run between 36 and 41 C, depending on the room temperature
(but to get this, I have my case off and an external fan blowing into
the box until I have time to add more fans). I have not seen the
APIC errors.
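
For reading the ones in your log: the ESR1 value looks like a bit mask,
and each set bit seems to match one of the "bit N" lines the kernel
prints after it. A rough sketch of that decoding, assuming only the
three bit names that appear in your log (decode_esr is my own name, and
any other bit is reported as unnamed rather than guessed):

```python
from typing import List

# Bit names copied from the kernel messages in the quoted log.
# Bits that do not appear there are deliberately left out.
ESR_BITS = {
    1: "APIC Receive CS Error (hw problem)",
    2: "APIC Send Accept Error",
    3: "APIC Receive Accept Error",
}


def decode_esr(value_hex: str) -> List[str]:
    """Return one description per set bit in a hex ESR value."""
    value = int(value_hex, 16)
    lines = []
    for bit in range(32):
        if value & (1 << bit):
            name = ESR_BITS.get(bit, "not named in the quoted log")
            lines.append(f"bit {bit}: {name}")
    return lines


# The ESR1 values that show up in the log.
for esr in ("00000002", "00000004", "00000006", "00000008"):
    print(esr, decode_esr(esr))
```

So 00000006 is bits 1 and 2 together, which is why that entry is
followed by two "bit" lines.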
You may want to try slowing things down a bit.
Eric
James Manning wrote:
>
> 1) Are all thermal grease compounds the same? Is there any real
> difference among them in terms of thermal performance (thinking about
> picking one up for the BX). Is it worth greasing the celerons *and*
> the BX? The BX currently gets a little over ambient temp, not cold
>
> 2) When 366's run at 550, Q3A dies with "Received signal 11, exiting"
> and I'm trying to figure out how valid it is to blame the non-ECC
> PC133 256MB SDRAM in the machine.
>
> 3) On a freeze after 2 hours up @ 550, the bios had the temps at
> 33 and 31 C for the 2 CPU's and 37 C for system. Since it doesn't
> look like the temps rise during running, are the lock-ups still
> heat-related (possibly)?
>
> 4) ISA soundcard... is it really that big a deal (in terms of lockups)?
> Will taking this lone ISA card out of the system help uptime?
>
> 5) Why do I get APIC errors even booting with "noapic" option?
> kernel: APIC error interrupt on CPU#0, should never happen.
> kernel: ... APIC ESR0: 00000004
> kernel: ... APIC ESR1: 00000006
> kernel: ... bit 1: APIC Receive CS Error (hw problem).
> kernel: ... bit 2: APIC Send Accept Error.
>
> (these are diff ones, just note how many of them bunch up in time)
>
> Feb 3 13:04:09 ns1 kernel: APIC error interrupt on CPU#0, should never happen.
> Feb 3 13:04:09 ns1 kernel: ... APIC ESR0: 00000002
> Feb 3 13:04:09 ns1 kernel: ... APIC ESR1: 00000002
> Feb 3 13:04:09 ns1 kernel: ... bit 1: APIC Receive CS Error (hw problem).
> Feb 3 13:40:29 ns1 kernel: APIC error interrupt on CPU#1, should never happen.
> Feb 3 13:40:29 ns1 kernel: ... APIC ESR0: 00000000
> Feb 3 13:40:29 ns1 kernel: ... APIC ESR1: 00000008
> Feb 3 13:40:29 ns1 kernel: ... bit 3: APIC Receive Accept Error.
> Feb 3 13:40:29 ns1 kernel: APIC error interrupt on CPU#0, should never happen.
> Feb 3 13:40:29 ns1 kernel: ... APIC ESR0: 00000002
> Feb 3 13:40:29 ns1 kernel: ... APIC ESR1: 00000006
> Feb 3 13:40:29 ns1 kernel: ... bit 1: APIC Receive CS Error (hw problem).
> Feb 3 13:40:29 ns1 kernel: ... bit 2: APIC Send Accept Error.
> Feb 3 13:40:37 ns1 kernel: APIC error interrupt on CPU#1, should never happen.
> Feb 3 13:40:37 ns1 kernel: ... APIC ESR0: 00000008
> Feb 3 13:40:37 ns1 kernel: ... APIC ESR1: 00000008
> Feb 3 13:40:37 ns1 kernel: ... bit 3: APIC Receive Accept Error.
> Feb 3 13:40:37 ns1 kernel: APIC error interrupt on CPU#0, should never happen.
> Feb 3 13:40:37 ns1 kernel: ... APIC ESR0: 00000004
> Feb 3 13:40:37 ns1 kernel: ... APIC ESR1: 00000004
> Feb 3 13:40:37 ns1 kernel: ... bit 2: APIC Send Accept Error.
>
> Note that I'm not stressing the machine *at all*. Load average
> around 0.08, no kernel builds, etc... the machine locked up after
> 3 hours of nothing of load (not even Q3A, as I can't get the DRM/DRI
> stuff working right under 2.3.42 yet)
>
> James
> --
> =- To unsubscribe, email [EMAIL PROTECTED] with the -=
> =- body of "unsubscribe linux-abit". -=
--
Eric Eastman
[EMAIL PROTECTED]
Silicon Graphics Federal, Inc.