On Sep 30, 2010, at 6:53 PM, da...@lang.hm wrote:

> On Thu, 30 Sep 2010, Giovanni Tirloni wrote:
>
>> Hello,
>>
>> Recently during an electrical maintenance, we faced a problem with some
>> servers that had redundant PSUs. After the power was shut down on the
>> circuit that serves the first PSU, the second PSU failed to keep some
>> servers up and they rebooted (came back normal and stayed stable after
>> that). Tonight the same procedure was done on the other power circuit and
>> the second PSU failed too (on a smaller number of machines). These are all
>> enterprise-level servers whose vendors will promptly replace failed PSUs,
>> but these PSUs were working fine as far as we can tell. Has anyone had this
>> problem too?
>>
>> I'm looking for some advice regarding proactive PSU replacements. Is it a
>> common practice? We do replace disks as proactively as we can by monitoring
>> several performance metrics but for PSUs I'm at a loss here.
>
> PSUs can die over time, and there can be (and have been) flaws in
> design or components that cause similar devices to start failing
> at around the same time.
>
> If I start having them fail on several servers, I accelerate efforts  
> to replace that generation of servers.
>
> The most common things to fail in a PSU are the capacitors, and if
> the vendor had a bad batch of caps that made it into the power
> supplies, I would be reluctant to just replace the PSUs and put the
> systems back into mission-critical use. If those caps are bad, how
> can I really trust the other caps in the system?
>
> I am in the process of doing this with a batch of systems purchased  
> 5-6 years ago.
>
> David Lang


Very true about them failing in batches, but the PSUs are almost
never made by the same manufacturer as the motherboard, etc. We had a
batch of Sparkle PSUs all fail within months of each other, but the
Supermicro motherboards are still going strong years later.


Jonathan



_______________________________________________
Tech mailing list
Tech@lopsa.org
http://lopsa.org/cgi-bin/mailman/listinfo/tech
This list provided by the League of Professional System Administrators
 http://lopsa.org/
