FWIW, I asked for the additional data that Hal requested. But this time there are no occurrences of "Disconnected switch|HCA" errors from 'ibdiagnet -r'.
The entire cluster was recently rebooted (probably the IB switches, too), opensm restarted, etc. So that seems to have cleared things up, at least for now. But this is something that we've seen on quite a few occasions, so we'll keep looking for it, and grab what debug info we can when it crops up again.

-- Arthur
