Re: [gentoo-user] MCE in kernel

2007-09-04 Thread Don Jerman
On 9/3/07, Alan E. Davis [EMAIL PROTECTED] wrote:
> Thank you.  I have solved the problem for now, but live in fear that there
> is something untoward going on in my hardware.

Quite possible.  An MCE can also be caused by misconfigured kernel
drivers.  I recently (accidentally) selected the ATI agpgart driver
instead of the Intel driver.  Most drivers correctly detect when their
corresponding device isn't present, but this one gamely tried to
manage the AGP bridge and fouled up memory whenever X started...

So you may want to review your kernel config and make sure you
actually have every device whose driver you've enabled.
-- 
[EMAIL PROTECTED] mailing list
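
Don's advice to review the kernel config can be made concrete with a
small helper.  What follows is only a sketch: the function name
check_kconfig is made up for illustration, and the symbols in the usage
comment (CONFIG_AGP_INTEL, CONFIG_AGP_ATI, CONFIG_MK8, CONFIG_X86_MCE)
are the 2.6-era option names as best I recall them, so verify them in
menuconfig for your own kernel version.  It prints how each named
option is set in a given .config file:

```shell
#!/bin/sh
# Report how selected options are set in a kernel .config file.
# Usage: check_kconfig CONFIG_FILE SYMBOL...
# (check_kconfig is a hypothetical helper, not an existing tool.)
check_kconfig() {
    config_file=$1
    shift
    for symbol in "$@"; do
        # A set option appears as SYMBOL=y (or =m, or =value); an unset
        # one appears as "# SYMBOL is not set" or is missing entirely.
        line=$(grep -E "^${symbol}=" "$config_file" || true)
        if [ -n "$line" ]; then
            echo "$line"
        else
            echo "${symbol} is not set"
        fi
    done
}

# Example (assumed symbol names): AGP bridge drivers, CPU type, MCE support.
# check_kconfig /usr/src/linux/.config \
#     CONFIG_AGP_INTEL CONFIG_AGP_ATI CONFIG_MK8 CONFIG_X86_MCE
```

Comparing that output against what lspci reports shows whether an
enabled driver has matching hardware, and running it again after
make oldconfig catches a changed CPU type.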



Re: [gentoo-user] MCE in kernel

2007-09-04 Thread Alan E. Davis
Thank you.  I noticed that when I ran make oldconfig on a new kernel, the
configs were not what I'd expected.  The wrong CPU type was configured.

Alan


-- 
Alan Davis, Kagman High School, Saipan  [EMAIL PROTECTED]

An inviscid theory of flow renders the screw useless, but the need for one
non-existent.
 ---Lord Rayleigh (aka John William Strutt), or else his son,


Re: [gentoo-user] MCE in kernel

2007-09-03 Thread Dan Farrell
On Sat, 1 Sep 2007 11:08:27 +1000
Alan E. Davis [EMAIL PROTECTED] wrote:

> I have been unable to boot into my gentoo system due to a Machine
> Check Exception.  This is an AMD 64 system.  MCE for AMD is enabled
> in the kernel (2.6.21 gentoo-sources).
>
> I am unable to boot in to turn off MCE checking.

Did you know you can disable this at boot time?  Check it out:

| $ grep mce /usr/src/linux/Documentation/kernel-parameters.txt
|   mce [IA-32] Machine Check Exception
|   nomce   [IA-32] Machine Check Exception  

Just add 'nomce' to your kernel boot line in grub and you should be able
to boot with MCE turned off to reconfigure.
-- Dan
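
For reference, with GRUB legacy the parameter goes at the end of the
kernel line in the grub config file, for example:

    kernel /boot/kernel-2.6.21-gentoo root=/dev/sda3 nomce

The image name and root device there are placeholders, not taken from
this thread.  On x86-64 kernels of this era the documented spelling is,
I believe, 'mce=off', with 'nomce' accepted for i386 compatibility.
Once the machine is up, a short check such as the sketch below confirms
that the parameter actually reached the kernel.  has_boot_param is a
hypothetical helper name; it reads /proc/cmdline unless given another
file:

```shell
#!/bin/sh
# Succeed if PARAM appears as a whole word on the kernel command line.
# Usage: has_boot_param PARAM [CMDLINE_FILE]
# (has_boot_param is a hypothetical helper, not an existing tool.)
has_boot_param() {
    param=$1
    cmdline_file=${2:-/proc/cmdline}
    # Put each parameter on its own line, then match the whole line
    # literally so that "mce" does not match "nomce".
    tr ' ' '\n' < "$cmdline_file" | grep -qxF -- "$param"
}

# Example:
# if has_boot_param nomce || has_boot_param mce=off; then
#     echo "machine check handling is disabled for this boot"
# fi
```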



Re: [gentoo-user] MCE in kernel

2007-09-03 Thread Alan E. Davis
Thank you.  I have solved the problem for now, but live in fear that there
is something untoward going on in my hardware.

Earlier on, this was intermittent.  I also wonder whether a register or a
CMOS flag was set, because after I booted the Ubuntu partition, the machine
did boot with no complaint.  It hadn't been going on long, though.  Well, I
finally was able to boot using an earlier kernel with no MCE flag set, then
recompile a newer kernel without it.

I think your solution is the better one, though.

I did follow the instructions of the boot messages and installed an mce log
translation utility, but I didn't make sense of what to do with it.

Thank you again,

Alan



Re: [gentoo-user] MCE in kernel

2007-09-03 Thread Dan Farrell
On Tue, 4 Sep 2007 06:51:38 +1000
Alan E. Davis [EMAIL PROTECTED] wrote:

> I think your solution is the better one, though.
>
> I did follow the instructions of the boot messages and installed an
> mce log translation utility, but I didn't make sense of what to do
> with it.

The thing is, you are only masking symptoms.  There may be something
wrong, and perhaps you could save a lot of work later by fixing a
problem before it turns catastrophic.  

From http://en.wikipedia.org/wiki/Machine_Check_Exception:

A Machine Check Exception, also called MCE, is a computer hardware
error which occurs when a computer's central processing unit detects an
unrecoverable hardware problem.

Normal causes for MCE errors are overheating and/or incorrect hardware
installation. Overheating can cause electrons to become more animated
and thus escape from the silicon tracks, resulting in corrupted data.
Some specific manually induced causes could be:

- Overclocking (naturally increases heat output)
- Poorly fitted heatsink/computer fans (the same problem can happen with
  excessive dust in the CPU fan)

Computer software can also cause errors in this way (normally by
corrupting data they are reading or writing). For example:

- Software performing read or write operations to non-existent memory
  regions which leads to confusion for the processor and/or the system
  bus.

3rd party programs

mcelog
mcelog is a Linux program to decode MCE's on x86-64 processors




Re: [gentoo-user] MCE in kernel

2007-09-03 Thread Alan E. Davis
Thank you Dan:

I'll look into this.  Time to tear the old box apart again.

Thank you again.

Alan



[gentoo-user] MCE in kernel

2007-08-31 Thread Alan E. Davis
I have been unable to boot into my gentoo system due to a Machine Check
Exception.  This is an AMD 64 system.  MCE for AMD is enabled in the kernel
(2.6.21 gentoo-sources).

I am unable to boot in to turn off MCE checking.  I was able to log in using
single-user mode.  The MCE happens at the end of the loading of the default
scripts; at least, this is what I am seeing on the screen: xdm has been
loaded.

The problem is, I have been installing Ubuntu on another partition, and it
boots fine.

If I have it right, I can download a gentoo live install disk and compile a
new kernel.  Is there a howto on this specific problem?

Thank you,

Alan Davis
