We have more than 400 machines. Every month there is one machine that we cannot reboot using IPMI, or whose SOL is not working.
We have something like 2500 nodes, mostly HP DL145 G2s, and see a wedged BMC probably 6-12 times/year. Can I ask what brand/model has such flaky IPMI? If you run "ipmi mc reset" on the node, does it resolve the problem? I wonder whether the flakiness might also correspond to some config or usage pattern. (Ours use DHCP from a local server; in fact, all the traffic is local.)
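For anyone who wants to script that check, here is a minimal sketch, assuming the tool on the node is ipmitool, which spells the command as `mc reset cold` or `mc reset warm`. The helper names `mc_reset_command` and `reset_bmc` and the host name in the usage note are made up for illustration. With no host given, it runs in-band on the node itself, which is what you want when the BMC's LAN side is the part that is wedged.

```python
import shlex
import subprocess


def mc_reset_command(kind="cold", host=None, user=None):
    """Build the ipmitool argv for a BMC reset (hypothetical helper)."""
    if kind not in ("cold", "warm"):
        raise ValueError("kind must be 'cold' or 'warm'")
    argv = ["ipmitool"]
    if host is not None:
        # Out-of-band over the LAN interface; ipmitool prompts for the
        # password if none is supplied.
        argv += ["-I", "lanplus", "-H", host]
        if user is not None:
            argv += ["-U", user]
    # With no host, ipmitool talks to the local BMC through the
    # in-band system interface.
    argv += ["mc", "reset", kind]
    return argv


def reset_bmc(kind="cold", dry_run=True, **kwargs):
    """Print the command (dry run) or execute it and return its exit code."""
    argv = mc_reset_command(kind, **kwargs)
    if dry_run:
        print(shlex.join(argv))
        return 0
    return subprocess.run(argv, check=False).returncode
```

Calling `reset_bmc()` only prints the command; `reset_bmc(dry_run=False)` actually runs it on the local node, and `reset_bmc(host="node042-bmc", user="admin", dry_run=False)` would try the same thing over the network.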
