> As for SMART on disks - I've never heard of SMART giving more
> than 24 hours notice of a failing disk, but I have heard of many
> cases where disks have died without prior warning even though
> SMART monitoring has been active. Other people seem to have the
> same experience. I get the impression that SMART monitoring
> may be a waste of (a relatively small number of) CPU cycles.

I guess experiences will vary. I know that on our Compaq servers, the SMART warnings are wonderful. Just recently I swapped out a drive that was giving off warnings about too many sectors having to be remapped. And in the case of Compaq servers, if you're using all the Insight Manager agents, a drive is covered under warranty *before* it has even failed (that good old pre-failure warranty), so you can swap it out even though technically it's still working okay. I've done that several times on Compaq systems, and I think it's wonderful.

But I guess it all depends on the monitoring software in place. All SMART is going to tell you is the number of errors and other drive indicators, and it's up to the software to determine whether those errors constitute normal behaviour or whether something more serious is involved.

The same goes for the CPU and fan monitoring. Those Compaq systems monitor fan rotation, temperatures at multiple points inside the case, etc. There are pre-defined limits for those sensors, so that if the temperature goes above its limit, you can either have the system shut down or keep it running and send an alert. On a server, it's preferable to have it send an alert, since having a system shut down without any notice can be aggravating, but that's a matter of whether or not you have 24/7 human monitoring of alerts so you can react to them quickly.

Granted, Compaq servers with all those features cost a pretty penny. Their desktops offer most of the same things, albeit with fewer sensors inside the case. :)

I think I've mentioned this before, but it is interesting to see how, when I kick off Prime95 on one of those servers, the fans, which normally are just idling, will kick into high gear. They're variable speed, so they only speed up when needed, and it doesn't take long at all for the CPUs to heat up and trigger the extra fan RPM.

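To make the "it's up to the software" point concrete, here is a rough sketch of the kind of threshold policy described above: each sensor or drive counter has pre-defined limits, and crossing one either raises an alert or (if you allow it) shuts the box down. This is only an illustration, not how Insight Manager actually works. The names `Threshold`, `evaluate`, `check_all` and `LIMITS`, and all of the numbers, are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    warn: float      # at or above this value, send an alert
    critical: float  # at or above this value, shut down if allowed


def evaluate(reading: float, limit: Threshold, allow_shutdown: bool = False) -> str:
    """Classify one reading as 'ok', 'alert', or 'shutdown'."""
    if reading >= limit.critical:
        # Alert-only mode keeps the server running and just notifies someone.
        return "shutdown" if allow_shutdown else "alert"
    if reading >= limit.warn:
        return "alert"
    return "ok"


# Hypothetical limits for illustration only; real values are vendor-specific.
LIMITS = {
    "cpu_temp_c": Threshold(warn=70.0, critical=85.0),
    "remapped_sectors": Threshold(warn=10, critical=100),
}


def check_all(readings: dict, allow_shutdown: bool = False) -> dict:
    """Evaluate every reading against its configured limit."""
    return {
        name: evaluate(value, LIMITS[name], allow_shutdown)
        for name, value in readings.items()
    }


if __name__ == "__main__":
    print(check_all({"cpu_temp_c": 72.0, "remapped_sectors": 3}))
```

With `allow_shutdown` left off, a critical reading still only produces an alert, which matches the "keep running but tell somebody" setup that makes sense when someone is watching alerts around the clock.
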
On a lot of the newest Compaq servers, you can put in redundant fans, so if one or two fail you're still covered, and you also get better cooling when all are running. Granted, these things can be QUITE loud (I have 3 on my desk right now and it is annoying, but thank goodness they're just the slim DL360s). The huge cluster boxes or 8-way servers can deafen a person, not to mention the StorageWorks boxes with those monstrous vortex fans on the back, pumping who knows how many CFM and generating quite a few decibels in the process. :)

Aaron
