Re: what causes Machine Check exception? revisited (2.2.18)

2001-05-08 Thread Mike Fedyk
On Mon, May 07, 2001 at 11:57:17AM +0100, Alan Cox wrote: > Generally it indicates a CPU problem but I've seen it caused by overclocking > and poorly fitted heatsinks I've been able to trigger a Machine check error on PPC when trying to boot directly from OF with a COFF kernel. The system has

RE: what causes Machine Check exception? revisited (2.2.18)

2001-05-07 Thread Simon Richter
On Mon, 7 May 2001, Dan Hollis wrote: > > Erm, it was bad RAM every time it happened to me. On standard PCs, you > > don't see those because you don't have ECC and the error is simply not > > detected. > So a 440bx motherboard with ECC ram is a non-standard PC? I bet the board doesn't force you to use

RE: what causes Machine Check exception? revisited (2.2.18)

2001-05-07 Thread nick
Yep, totally. I've worked on hundreds of systems and fewer than 20 of the workstations or PCs have been using ECC. Most servers do, but not even all of them. Nick On Mon, 7 May 2001, Dan Hollis wrote: > On Mon, 7 May 2001, Simon Richter wrote: > > On Mon, 7 May 2001, Bene, Martin wrote:

RE: what causes Machine Check exception? revisited (2.2.18)

2001-05-07 Thread Dan Hollis
On Mon, 7 May 2001, Simon Richter wrote: > On Mon, 7 May 2001, Bene, Martin wrote: > > Definitely not caused by: > > Bad Rams, mb-chipset. > Erm, it was bad RAM every time it happened to me. On standard PCs, you > don't see those because you don't have ECC and the error is simply not > detected. So a

RE: what causes Machine Check exception? revisited (2.2.18)

2001-05-07 Thread Simon Richter
On Mon, 7 May 2001, Bene, Martin wrote: [MCE caused by bad RAM] > I don't think there is a way a machine check exception can be triggered by > software - which it would have to be in order to be caused by bad RAMs. An MCE is triggered by an ECC error - no software involved. A good trap handler

RE: what causes Machine Check exception? revisited (2.2.18)

2001-05-07 Thread Ricardo Galli
>> Definitely not caused by: >> Bad Rams, mb-chipset. > > Erm, it was bad RAM every time it happened to me. On standard PCs, you > don't see those because you don't have ECC and the error is simply not > detected. I did have the same problem with an SMP Intel 440LX which ran without any problem since

Re: what causes Machine Check exception? revisited (2.2.18)

2001-05-07 Thread Alan Cox
> You get SIG11 errors when running programs (kernel compile seems to be a good > example), you get crashing processes, you get all sorts of weird funnies but > you really shouldn't get machine check exceptions. > > I don't think there is a way a machine check exception can be triggered by > software -

RE: what causes Machine Check exception? revisited (2.2.18)

2001-05-07 Thread Bene, Martin
Hi Simon, > On Mon, 7 May 2001, Bene, Martin wrote: > > > Definitely not caused by: > > Bad Rams, mb-chipset. > > Erm, it was bad RAM every time it happened to me. On standard PCs, you > don't see those because you don't have ECC and the error is simply not > detected. Strange - definitely, strange.

RE: what causes Machine Check exception? revisited (2.2.18)

2001-05-07 Thread Simon Richter
On Mon, 7 May 2001, Bene, Martin wrote: > Definitely not caused by: > Bad Rams, mb-chipset. Erm, it was bad RAM every time it happened to me. On standard PCs, you don't see those because you don't have ECC and the error is simply not detected. Simon -- GPG public key available from

Re: what causes Machine Check exception? revisited (2.2.18)

2001-05-07 Thread Alan Cox
> After searching the archives of the list I found some similar reports > from September and December 2000 but as far as I understood the cause of > the error was blamed on the CPU. Is this the most probable case? A machine check (trap 18) is signalled by the processor when it thinks it is in

RE: what causes Machine Check exception? revisited (2.2.18)

2001-05-07 Thread Bene, Martin
Hi Juhan, > After searching the archives of the list I found some similar reports > from September and December 2000 but as far as I understood > the cause of > the error was blamed on the CPU. Is this the most probable case? > > Best regards, > > Juhan Ernits > > -- /var/log/kern.log
