We've had that happen on some of our servers. Currently using the
disable_msi workaround, which seems to have stopped it. I believe
there's supposed to be a fix in the latest Red Hat kernel but we haven't
really tested that yet.
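For anyone who hasn't come across it: as far as I know, the workaround
is a bnx2 module option. A minimal sketch, assuming a stock RHEL/CentOS 5
box where module options live in /etc/modprobe.conf (the file location
is an assumption; newer distributions use a file under /etc/modprobe.d/
instead):

    # /etc/modprobe.conf
    # Assumption: makes bnx2 fall back to legacy interrupts instead of MSI
    options bnx2 disable_msi=1

The driver has to be reloaded (or the machine rebooted) before the
option takes effect.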
You lose network connectivity to the server (including IPMI), but not
all connectivity: a serial console (not SOL, a proper serial console or
a console server) still works, as would a locally attached
keyboard/monitor. Unless you require network to log in :) . If one runs
into this, it's a really weird one (before you find the bug report) - to
all appearances, the server works happily, no strangeness in the logs -
just network gone completely.
It's not easy to trigger - a hard-to-track-down sort of thing. We had
610s and 710s for a while before this first happened (and there are
still loads we've never seen it on). We first saw it on a rather heavily
used NFS server (i.e. lots of network I/O).
Tina
Cris Rhea wrote:
>> In case it helps anyone using Dell R410 / 610 / 710 etc. servers: I have had
>> machines lose their eth connections periodically (CentOS 5.4 bnx2 driver).
>> Seems like a bug with the Broadcom NIC drivers. [luckily read of it on a
>> Dell mailing list]
>> Bug Reports:
>> http://kbase.redhat.com/faq/docs/DOC-26837
>> http://patchwork.ozlabs.org/patch/51106
>> Not sure yet if this is exactly my issue but I'm giving it a shot now.
>> Thought I'd post since, anecdotally I've seen many people use these servers
>> on the list.
>> --
>> Rahul
> I've been following this on the Dell list as I have approx. 50 R410s
> in our cluster.
> One thing that isn't clear-- When this happens, do you lose all
> connectivity to the node (i.e., do you have to reboot the node to
> re-establish eth0)?
> My R410s are running CentOS 5.2 - 5.4 and I rarely have one go
> down.
> --- Cris
--
Tina Friedrich, Computer Systems Administrator, Diamond Light Source Ltd
Diamond House, Harwell Science and Innovation Campus - 01235 77 8442
_______________________________________________
Beowulf mailing list, [email protected] sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf