I don't know much about Dell specifically, but I can offer some general guidance.
If the Broadcom part uses the tg3 driver, you may be out of luck, depending on the failure state. For example, BCM5704 chips fundamentally cannot provide BMC access while executing PXE. Chips managed by bnx2, on the other hand, tend to fare better: there is generally at least one way to make it work correctly, though drivers and NIC firmware still matter *greatly*. It's not as resilient as I would like, but with precautions in how you manage firmware and drivers, it's workable.

You'll want to check your tg3/bnx2/whatever driver version and NIC firmware version, depending on what your investigation turns up. Shared NICs can work great, but some implementations are picky about which drivers and firmware are in place. Also, newer is not always better: sometimes a developer who isn't thinking about the IPMI access that some NICs provide will unwittingly break it in the driver, and it won't get fixed until some server vendor or industrious administrator stumbles across it.

From: Rahul Nabar <rpna...@gmail.com>
To: Jarrod B Johnson/Raleigh/i...@ibmus
Cc: ipmitool-devel@lists.sourceforge.net, linux-powere...@dell.com
Date: 08/30/2010 10:52 AM
Subject: Re: [Ipmitool-devel] Is the BMC robust to recover from system hangs? impitool unresponsive

On Mon, Aug 30, 2010 at 7:52 AM, Jarrod B Johnson <jbjoh...@us.ibm.com> wrote:

> Your BMC simply isn't responding to any traffic. BMCs are supposed to be completely resilient to OS failures when done properly (not much apart from things like power failures in non-redundant systems should be capable of knocking out a quality IPMI implementation). You need to look to your system vendor's support for an explanation and/or resolution, since implementations vary greatly from one vendor to the next.
>
> Sometimes a vendor is not competent to make it work, sometimes a vendor is too cheap to make it easy, and sometimes a vendor simply hasn't covered your particular NIC driver/OS combination and the NIC vendor flubbed some register handling or some such to make the NIC shoot itself when the kernel panics.

Thanks for the tips, Jarrod! I will look into the nodes. These are Dell R410 servers with the on-board Broadcom NIC. The first thing for this Monday morning is for me to trudge down to the dark depths of the cluster room, log in manually, and see what exactly happened to these nodes. I'll post on the list if I find anything interesting.

--
Rahul
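For the driver and firmware check suggested at the top of this thread, a minimal sketch in Python might look like the following. It assumes a Linux host with the ethtool utility installed, and it relies on `ethtool -i <interface>` printing "key: value" lines such as driver, version, and firmware-version. The function names, the interface name eth0, and the sample version strings are hypothetical and chosen only for illustration.

```python
import subprocess


def parse_ethtool_info(text):
    """Parse `ethtool -i` style "key: value" lines into a dict."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as PCI
        # addresses (which contain colons) stay intact.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def nic_driver_info(interface):
    """Run `ethtool -i <interface>` and return the parsed fields."""
    result = subprocess.run(
        ["ethtool", "-i", interface],
        capture_output=True, text=True, check=True,
    )
    return parse_ethtool_info(result.stdout)


# Illustrative sample only; the version strings are placeholders,
# not real driver or firmware releases.
SAMPLE = """driver: bnx2
version: 0.0.0-example
firmware-version: 0.0.0-example
bus-info: 0000:01:00.0
"""

if __name__ == "__main__":
    info = parse_ethtool_info(SAMPLE)
    print(info["driver"], info["version"], info["firmware-version"])
```

On a real node, nic_driver_info("eth0") would return the live values, which can then be compared against whatever driver and firmware combination your server vendor has validated for shared-NIC BMC access.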
_______________________________________________ Ipmitool-devel mailing list Ipmitool-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/ipmitool-devel