On Thursday, I pulled the four 128MB DIMMs out of Phantom and replaced them with a single, known-working 128MB DIMM. We haven't had a kernel Oops or a crash in more than a day, which is a record for the new box. I took the potentially bad RAM and ran each DIMM individually through memtest86 v3.0 on my home machine. All four DIMMs ran through all tests without a single error, some of them 20+ times. This seems pretty strange, because as soon as I pulled the old RAM, Phantom started to behave like a saint. This /would/ indicate that the old RAM was bad, wouldn't it? Could the RAM be bad and memtest86 not recognize it? Seems unlikely to me. Does anyone know of any way to test the RAM further? The only things I can think of to explain this peculiar situation are:

1. one of the DIMM slots on the motherboard is bad, or
2. our kernel (2.4.20, Debian testing) can't handle 4 DIMMs totalling 512MB, or
3. the motherboard wasn't actually designed for that much memory

Option 2 is a stretch. I can see option 1 or 3 (or both) being the likely culprit. What are your thoughts? How should we proceed to minimize downtime and find the cause of the problem?

--Dave

____________________
BYU Unix Users Group
http://uug.byu.edu/
___________________________________________________________________
List Info: http://phantom.byu.edu/cgi-bin/mailman/listinfo/uug-list
