Hrm, it doesn't look to be a heat issue. I opened the case and took a
look inside when I saw the last message, and it looks perfectly cool and
happy.

I do notice, however, that it only happens when I hammer the RAID
array, which is on a PCI Promise controller.
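
For what it's worth, the "bitNN" lines in the log quoted below can be
read straight off the STATUS words. Here is a rough sketch in Python,
assuming only the three bit positions the decoder output itself names
(32, 46 and 62). The helper name decode_status is made up for
illustration and is not part of any tool:

```python
# Bit positions and labels are copied from the decoder output quoted in
# this thread. decode_status is a hypothetical helper for illustration.
FLAGS = {
    32: "err cpu0",
    46: "corrected ecc error",
    62: "error overflow (multiple errors)",
}


def decode_status(status_hex):
    """Return the labels of the known flag bits set in a STATUS word."""
    status = int(status_hex, 16)
    return [
        "bit%d = %s" % (bit, label)
        for bit, label in sorted(FLAGS.items())
        if status & (1 << bit)
    ]


# The two STATUS words from the quoted log.
for word in ("9431400100000813", "d000400000000863"):
    print(word, decode_status(word))
```

Both words have bit 46 set, so both errors were corrected by ECC, and
only the second one has the overflow bit (62) set.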

On Tue, 6 Dec 2005, Deedra Waters wrote:

> Date: Tue, 6 Dec 2005 22:04:50 -0600 (CST)
> From: Deedra Waters <[EMAIL PROTECTED]>
> Reply-To: [email protected]
> To: [email protected]
> Subject: Re: [gentoo-amd64] mce log errors
>
> Is there a way to test that? I've tried to work with lm_sensors,
> but its readings are way, way off. So, considering lm_sensors
> is useless, is there another way to tell if overheating is the problem?
>
> The case itself has a lot of fans, but it's also got 5 hard drives in it.
> On Tue, 6 Dec 2005, Daniel Gryniewicz wrote:
>
> > Date: Tue, 06 Dec 2005 18:39:48 -0500
> > From: Daniel Gryniewicz <[EMAIL PROTECTED]>
> > Reply-To: [email protected]
> > To: [email protected]
> > Subject: Re: [gentoo-amd64] mce log errors
> >
> > On Tue, 2005-12-06 at 14:56 -0600, Deedra Waters wrote:
> > > All,
> > >
> > > I'm getting a lot of these, but it only seems to happen when I put the
> > > machine under a lot of stress, and even then it doesn't always happen.
> > > This machine is a dual Opteron 242, the board is an ASUS K8, and with
> > > the latest BIOS update, the machine has no real problems at all.
> > >
> > > MCE 1
> > > CPU 0 4 northbridge TSC 8f1a7b270b6f
> > > ADDR 75c3320
> > >   Northbridge ECC error
> > >   ECC syndrome = 62
> > >        bit32 = err cpu0
> > >        bit46 = corrected ecc error
> > >   bus error 'local node origin, request didn't time out
> > >       generic read mem transaction
> > >       memory access, level generic'
> > > STATUS 9431400100000813 MCGSTATUS 0
> > > MCE 2
> > > CPU 0 2 bus unit TSC 8f8ad2325db7
> > >   L2 cache ECC error
> > >   Bus or cache array error
> > >        bit46 = corrected ecc error
> > >        bit62 = error overflow (multiple errors)
> > >   bus error 'local node origin, request didn't time out
> > >       prefetch mem transaction
> > >       memory access, level generic'
> > > STATUS d000400000000863 MCGSTATUS 0
> >
> > CPU cache is getting ECC errors.  Smells like overheating.
> >
> > Daniel
> >
>
>

-- 
Deedra Waters - Gentoo developer relations, accessibility and infrastructure -
[EMAIL PROTECTED]
Gentoo linux: http://www.gentoo.org

-- 
[email protected] mailing list
