Thanks for the insightful info. Yes, as another user had suggested privately, I have been running memtest86 pretty much since my post last night (early morning).
Thus far: 16 passes over almost 17 hours, and no errors. I know, as you pointed out, that a clean run doesn't really rule out bad memory module(s). I'm going to try swapping out modules; maybe I'll get lucky.

--- Marcus Watts <[EMAIL PROTECTED]> wrote:

> > I've not seen this type of problem before, so I
> > turn to you guys. Is this a sign that maybe
> > a drive is going bad? Or sign of bad memory?
> >
> > What's going on here!? I know it is almost
> > Halloween and all, but this is kinda _spooky_
> > to say the least.
> >
> >
> > Idea? Please? :-)
>
> Hard drives contain lots of moving parts, a known reliability risk.
> Therefore most if not all modern hard disks and associated logic
> contain more or less elaborate internal self-checking logic to detect
> failing media, failing spindle motor, failing head positioning
> mechanism, over and under voltage, bus driver failure, etc. Most of
> these will result in kernel messages and/or other obvious signs of
> system distress. Your "dmesg" (assuming it was done after the failed
> build) doesn't show any evidence of such a problem, so there's no reason
> to suspect a hard disk going bad.
>
> More likely possibilities are bad memory, a bad motherboard,
> incompatible memory, bad disk controller, mis-configured bus speeds,
> environmental problem, or possibly but less likely, a bad CPU. Memory
> is simple: if you buy a "consumer grade" home machine, you get memory
> that has no self-check logic. A chip going bad could well produce the
> problems you show below. A "server class" machine will nearly always
> contain ECC memory. A few companies (Dell, Sun) also make "commercial
> grade" desktop machines, which usually also contain ECC. Note that
> most "home computer" stores and even many professionals don't understand
> or value ECC memory, and will steer you away from such technology.
>
> If it's memory, even without self-check logic it may still be easy to
> see if it's broken.
> "memtest86+" has a good reputation. This is a
> stand-alone program, which you can leave running overnight. If it
> fails memtest86+, then the problem is obvious. If it passes, the
> memory is still not in the clear; for instance, it's in theory possible
> for the memory to fail when accessed by DMA but not by the processor.
> If you can get the memory to fail more or less predictably, and you
> have multiple memory modules, you may be able to play remove & swap
> games to identify which module is bad. Check your hardware doc first -
> on some systems, modules may need to be paired in some particular
> fashion.
>
> It is certainly worth checking your machine for obvious physical
> problems. For instance, check air paths to ensure they aren't
> blocked. Be suspicious of burning smells, obvious heat, excessive fan
> noise, or lack of distinct air flow. Check the inside of the machine.
> Is there excessive dust build-up? Are the fan blades clean? Do the
> fans spin very smoothly and fairly freely? Are the cables in the way?
> Are there any loose cables? Loose boards? Bad solder joints or
> cracks? (On most modern motherboards, it's not worth spending much
> time checking this if it's not easy to get to; removing the motherboard
> may itself cause damage, and even a "large" crack sufficient to produce
> complete failure may be nearly impossible to spot). Other signs of
> physical distress? Ideally you want your machine to be in a
> climate-controlled environment comfortable to people. Dust, very dry
> air, excessive moisture, temperature cycles, etc. are all bad.
> Electrically conductive dust can become particularly exciting.
>
> An older or fancier machine may have a separate disk controller, in
> which case if you have a spare it may be worth swapping. Your machine
> is probably not one of these.
>
> On many newer machines, the BIOS can contain settings which alter the
> speed or timing of various bus components.
> Getting this wrong can
> produce subtle weirdness, or obvious and dramatic signs of failure.
> It may take a while for subtle weirdness to manifest itself in any
> obvious fashion. If you have ECC memory, make sure the BIOS knows that.
>
> Sorting all this out can take time. If the machine is an older one, it
> may be cheaper to replace it than figure out what failed.
>
> Also, in case you missed it, building large software packages is
> an excellent way to burn a new machine in or establish
> that an existing machine is reliable. :-)
>
> -Marcus

