On Wed, Oct 08, 2008 at 08:30:12AM +0200, Mister Olli wrote:
> hi...
>
> thanks for the feedback on this topic.
> the first step to clean the machine and check all connectors has been
> done yesterday. I hope that this will fix the problem, and that it's not
> some kind of hardware failure.
>
> to run tests with memtest is quite a problem, since the machine has high
> availability requirements. to take it off for nearly one hour for
> cleaning and checking during daily work of our company was a pain.
> 6 hours or more of RAM tests is not possible.
>
> is there some other way to detect hardware failure with less time
> consuming tool/process?

Yes -- you start replacing hardware one piece at a time until the
problem goes away. That will also require downtime, quite regularly,
and waste money.

So to answer your question: no, there is no way to easily track down the
source of a hardware failure, or determine what piece has failed (if
any). This is completely 100% normal when it comes to computers,
especially x86 PCs. Anyone who has worked in the IT field for many
years knows this. :-)

I'm amazed that in this day and age, any company would have a single
host as a single-point-of-failure. You can't take this machine down for
troubleshooting, but you have no failover available. The company has
put themselves into this situation.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |

_______________________________________________
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions