On Mon, Jan 09, 2012 at 09:55:58AM -0800, Freddie Cash wrote:
> On Mon, Jan 9, 2012 at 9:50 AM, John Nielsen <li...@jnielsen.net> wrote:
> > From what you've said I strongly suspect that you have some kind of
> > hardware issue. Dodgy RAM is my first guess, something cooling-related is
> > my 2nd, and PSU is my 3rd. It is a little suspicious that you only started
> > having problems after your upgrade but it could be coincidence or it could
> > be something about the new software tickling the hardware differently than
> > the old.
>
> That's what we're leaning toward as well.  We're planning on doing a
> BIOS upgrade (betadrive is running v2.00 and alphadrive is v1.00),
> then a memtest86+ run, then check firmware on the SATA controllers.

For hardware/system troubleshooting advice:

1) BIOS upgrade -- the BIOS is also what's responsible for ACPI bits
   and other "configuration model" pieces of a system,
2) BIOS settings -- make sure they're all 100% identical between both
   systems,
3) Controller firmware -- please make sure these are the same (your
   controllers between boxes appear to be the same model),
4) Flaky PSU -- voltages may drop below or rise above the levels the
   mainboard can handle.  As someone who buys Supermicro exclusively
   for their systems, I can tell you that their PSUs ("Ablecom") are
   quite cheap/horrible.  It's worth purchasing a replacement -- if it
   doesn't turn out to be the problem, you now have a spare PSU (which
   is good to have -- our last systems failure was due to a blown PSU).
5) Flaky RAM -- memtest86+ can help here, mostly but not entirely.
6) Flaky mainboard -- it happens.  Really.  :-)

For OS advice:

Compare rc.conf, loader.conf, and so on.  For example, is one system
using powerd(8) while the other isn't?

> If none of the above helps, we're thinking of swapping the CPUs
> between the two systems to see if the problems stay with the box or
> follow the CPU.

I was helping out someone on a public forum earlier this week who
purchased a Dell desktop system that started behaving oddly.  memtest86+
claimed all his DIMMs were bad (regardless of slot), and it claimed the
same thing about the replacement DIMMs.  Dell kept insisting he reload
the OS, or else they could try a motherboard swap, blah blah blah.

What amused me was that nobody looked at the CPU: Intel Core i3-550,
which contains an on-die MCH.  Chances are the MCH is going bad, which
means it's time to replace the CPU.

CPUs rarely go bad, but now with on-die MCHs, on-die VGA, etc. it's
becoming much more plausible that the physical CPU needs to be replaced.
They've become practically computers inside of a computer.
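
For the rc.conf/loader.conf comparison suggested above, something along
these lines might save some eyeballing.  This is only a rough sketch: the
function names parse_conf and conf_diff are made up for illustration, it
assumes both files have already been copied to one machine, and it assumes
simple one-line KEY="value" settings (multi-line values or unbalanced
quotes are not handled).

```python
import shlex


def parse_conf(text):
    """Parse rc.conf/loader.conf-style KEY="value" lines into a dict.

    Assumption: one setting per line; blank lines and comments are skipped.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # shlex strips the shell-style quoting and any trailing "# comment".
        parts = shlex.split(value, comments=True)
        settings[key.strip()] = parts[0] if parts else ""
    return settings


def conf_diff(text_a, text_b):
    """Return {key: (value_a, value_b)} for every setting that differs.

    A value of None means the key is absent from that file.
    """
    a, b = parse_conf(text_a), parse_conf(text_b)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }
```

Feed it the contents of each box's file, e.g. the text of alphadrive's
/etc/rc.conf and betadrive's /etc/rc.conf, and anything it returns is a
setting worth a second look (powerd_enable set on one box but not the
other, for instance).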
:-)

--
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, US |
| Making life hard for others since 1977.               PGP 4BD6C0CB |

_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"