>> We have more than 400 machines. Every month there is one machine that we
>> cannot reboot using IPMI, or on which SOL is not working.
>
> we have something like 2500 nodes, mostly HP dl145g2's, and have a
> BMC-wedge probably 6-12 times/year. can I ask what brand/model has such
> flaky IPMI?
> if you run "ipmi mc reset" on the node, does it resolve the problem?
> I wonder whether flakiness might also correspond to some config or usage
> pattern. (ours dhcp from a local server - actually all the traffic is
> local.)

Mark,

Do you have SOL working on the HP DL145-G2? I also have these nodes, and
although I can use most IPMI functions (including remote power up/cycle),
I cannot get SOL to work.

Also, I have noticed that the kipmi0 daemon does consume a little CPU
time, e.g., 45 minutes over 9 days of uptime (with the default top
refresh, it shows up every 4 screens or so). This is on CentOS 5.3.

Regards,
paulo

--
Paulo Afonso Lopes                  | Tel: +351- 21 294 8536
Departamento de Informática         |           294 8300 ext.10702
Faculdade de Ciências e Tecnologia  | Fax: +351- 21 294 8541
Universidade Nova de Lisboa         | e-mail: [email protected]
2829-516 Caparica, PORTUGAL

_______________________________________________
Beowulf mailing list, [email protected] sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf
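
To put the kipmi0 figure above in perspective, here is a small sketch that converts "45 minutes of CPU over 9 days of uptime" into a share of one core. The function name `cpu_share` is made up for illustration, and the calculation assumes the 45 minutes is accumulated CPU time as reported by top.

```python
def cpu_share(cpu_minutes: float, uptime_days: float) -> float:
    """Return CPU time as a percentage of wall-clock uptime (one core)."""
    uptime_minutes = uptime_days * 24 * 60
    return 100.0 * cpu_minutes / uptime_minutes


# Figures quoted in the message: 45 minutes of CPU over 9 days of uptime.
share = cpu_share(45, 9)
print(f"kipmi0 used about {share:.2f}% of one core")
```

With these numbers the share comes out well under 1% of a single core, which fits the description of the daemon only occasionally showing up in top.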
