On Sat, Nov 20, 2004 at 09:31:12AM +0200, [EMAIL PROTECTED] wrote:
> Hi,
> 
> On 11/20/2004 03:14 AM, guy keren wrote:
> > 1. if it's every few months - does it happen with the exact same kernel
> >    code and configuration? did i upgrade my kernel several times between
> >    incidents?
> 
> Same kernel (Debian Woody latest 686 SMP), same hardware.

This kernel (2.4.18-xx) is incredibly ancient. Not sure how feasible
it is for you to try a modern kernel on the same setup, but if you
can, it's a lot easier than chasing this bug down. 

> > 4. does it always crash with the exact same oops message (i.e. apache
> >    causing a crash during 'fput' invoked from a 'close' system call)?
> 
> No. Last time it was mysqld, with a garbled call path.

Can you run memtest on the box? Being a production server, probably
not...

> All occurrences I saw relate to either hardware problems or kernel
> bugs.

That's a tautology :-)

> It would be somewhat surprising if this is a hardware problem, since
> this server is spending most of its time in userspace and there are no
> userspace crashes (note that a localized fault in a chunk of physical
> memory used only by the kernel, such as very low memory, is unlikely due
> to the use of ECC memory). Conversely, the SMP configuration makes it
> slightly less unthinkable that a kernel bug is involved.

Userspace might be dying and no one notices... that's one of the nice
things about kernel oopses, it's hard to ignore them ;-)

Cheers, 
Muli
-- 
Muli Ben-Yehuda
http://www.mulix.org | http://mulix.livejournal.com/