Title: RE: [openib-general] opensm fails to bring up subnet..

Hi,
Sorry for catching up with this late in the thread. (Thanks Hal for waking me up...)
>
> It appears that a node is not responding to a discovery packet (SM Get
> NodeInfo (attrID 0x11)). Its direct route initial path (an array of
> port numbers at the start of the next hop) is:
> Initial path = [1][81][1] which means that starting at the node running
> OpenSM, port 1 then port 129 then port 1. Is there a large switch in the
> middle ? Can you send the output of ibnetdiscover ? If that is valid,
> which HCA (port) is not responding (what is the GUID) ?
[EZ] Normally all directed route dumps should start with:
Initial path = [0][....
The first hop is reserved to 0, so I wonder whether the above text is a direct quote from the osm.log?
The fact that you got a [81] there means that the packet should leave from port 81??
I have never seen a switch with more than 24 ports...
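
To make the check above concrete, here is a minimal sketch of decoding such a dump line. It assumes the bracketed values are hexadecimal port numbers (which is how the quoted text reads [81] as port 129) and that the line has the "Initial path = [..][..]" shape shown above. The function names are made up for illustration and are not part of OpenSM.

```python
import re

def parse_initial_path(line):
    """Return the hops of an 'Initial path = [..][..]' line as decimal ints.

    Assumption: each bracketed value is a hexadecimal port number.
    """
    hops = re.findall(r"\[([0-9A-Fa-f]+)\]", line)
    return [int(hop, 16) for hop in hops]

def starts_with_reserved_hop(path):
    """True if the first hop is 0, as expected for a directed route dump."""
    return bool(path) and path[0] == 0

path = parse_initial_path("Initial path = [1][81][1]")
print(path)                            # [1, 129, 1]
print(starts_with_reserved_hop(path))  # False
```

With the path quoted in this thread the first hop is 1 rather than 0, which is why I doubt the line is a verbatim copy from the log.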

> Unfortunately on such an error osm does not appear to give up  (it
> retries forever and is locked on such a node). This is obviously not
> good.
Also, Troy, if you are able to capture the entire log, it might shed some light on the "OpenSM never gives up" issue in such cases, which we want to resolve.

_______________________________________________
openib-general mailing list
[email protected]
http://openib.org/mailman/listinfo/openib-general
