We have more than 400 machines. Every month there is one machine that we
cannot reboot using IPMI, or whose SOL is not working.

we have something like 2500 nodes, mostly HP DL145 G2s, and see a wedged BMC
probably 6-12 times/year.  can I ask what brand/model has such flaky IPMI?
if you run "ipmi mc reset" on the node, does it resolve the problem?
I wonder whether the flakiness might also correspond to some config or usage
pattern.  (ours dhcp from a local server - actually all the traffic is local.)
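
for anyone who wants to script that reset, here is a minimal sketch. it assumes
the tool in question is ipmitool (whose subcommand is "mc reset cold" or
"mc reset warm") and that it is run in-band on the affected node with the
kernel IPMI drivers loaded. the function names and the dry_run flag are made
up for illustration, not part of any existing tool.

```python
import shutil
import subprocess


def bmc_reset_command(kind="cold"):
    """Build the ipmitool argument list for an in-band BMC reset.

    Assumption: ipmitool is the tool in use; 'cold' and 'warm' are the
    two reset types its 'mc reset' subcommand accepts.
    """
    if kind not in ("cold", "warm"):
        raise ValueError("reset kind must be 'cold' or 'warm'")
    return ["ipmitool", "mc", "reset", kind]


def reset_local_bmc(kind="cold", dry_run=True):
    """Reset the BMC of the machine this runs on (hypothetical helper).

    With dry_run=True (the default), or when ipmitool is not installed,
    nothing is executed and the command is only returned for inspection.
    """
    cmd = bmc_reset_command(kind)
    if dry_run or shutil.which("ipmitool") is None:
        return cmd
    # Usually needs root and a loaded IPMI device driver.
    subprocess.run(cmd, check=True, timeout=60)
    return cmd


if __name__ == "__main__":
    print(" ".join(reset_local_bmc(dry_run=True)))
```

the dry-run default is deliberate: a reset briefly takes the BMC off the
network, so it seems safer to make actually running it an explicit choice.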
_______________________________________________
Beowulf mailing list, [email protected] sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
