On Mon, 2006-07-17 at 21:55 +0400, Vladimir V. Saveliev wrote:
> Hello
> 
> On Mon, 2006-07-17 at 10:53 +0200, Francisco Javier Cabello wrote:
> > Hello Vladimir,
> > > such corruptions used to be considered as hardware bugs. Memory failure,
> > > for instance. Did you ever run memtest on your systems?
> > 
> > Yes, We have run memtest in our system. It's very seldom to find a system
> > with a hardware memory problem running. When we find a memory problem the
> > kernel doesn't boot. I am going to pass memtest in some of the system with
> > reiserfs corruption problem.
> > 

This is not true. Certain memory faults still allow the system to boot
and appear to run OK. I had a system that didn't show a memory error
until the 4th pass of memtest. I just happened to let it run over the
weekend. I have also seen issues on my larger systems with 64GB of RAM,
where memtest detected nothing after a week but the kernel's mcelog
reported weird ECC memory errors. I replaced several DIMMs and the
issue went away. But who knows what could have occurred had I not
replaced the memory.

Brad Dameron
SeaTab Software
www.seatab.com
