Hi Eitan,

On 7/24/07, Eitan Zahavi <[EMAIL PROTECTED]> wrote:

 *Hi Hal,*
*What is this "loopback" connector used for?*
*Does not seem to me like a very useful thing to do.*


Perhaps not but no reason OpenSM can't handle this more gracefully.

*Anyway, if it is not a production environment we could add a "debug mode"
(-d flag option) to ignore this check.*


Why would a separate flag be needed ?

-- Hal



*Eitan Zahavi*
Senior Engineering Director, Software Architect
Mellanox Technologies LTD
Tel:+972-4-9097208
Fax:+972-4-9593245
P.O. Box 586 Yokneam 20692 ISRAEL


 ------------------------------
*From:* Hal Rosenstock [mailto:[EMAIL PROTECTED]
*Sent:* Tuesday, July 24, 2007 5:31 PM
*To:* OpenFabrics General
*Cc:* Sasha Khapyorsky; Eitan Zahavi; Yevgeny Kliteynik
*Subject:* OpenSM detection of duplicated GUIDs on loopback


 Hi,

This is what starts off as a "minor" issue and I know it has been
discussed somewhat in the past:

Putting a loopback connector on a (switch) link causes OpenSM to indicate
duplicated GUID error 0D18 as follows:

__osm_ni_rcv_set_links
{
...
          /*
             When there are only two nodes with exact same guids (connected back
             to back) - the previous check for duplicated guid will not catch
             them. But the link will be from the port to itself...
             Enhanced Port 0 is an exception to this
          */
          if ((osm_node_get_node_guid( p_node ) == p_ni_context->node_guid) &&
              (port_num == p_ni_context->port_num) &&
              (port_num != 0))
          {
            osm_log( p_rcv->p_log, OSM_LOG_ERROR,
                     "__osm_ni_rcv_set_links: ERR 0D18: "
                     "Duplicate GUID found by link from a port to itself:"
                     "node 0x%" PRIx64 ", port number 0x%X\n",
                     cl_ntoh64( osm_node_get_node_guid( p_node ) ),
                     port_num );
...

So this occurs over and over and over and fills the log with the same
spew. This should be improved IMO.
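
One way the repeated message could be tamed is to log only when a port first becomes self-linked and to stay quiet on later sweeps until the condition clears. The following is a rough sketch of that idea only, not OpenSM code: sketch_log_state_t and sketch_report_self_link are hypothetical names, and fprintf stands in for osm_log.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical per-port state; nothing like this exists in the quoted code. */
typedef struct sketch_log_state {
    bool reported; /* true once the self-link has been logged */
} sketch_log_state_t;

/*
 * Log the self-link once per occurrence.
 * Returns the number of log lines written (0 or 1).
 */
static int sketch_report_self_link(sketch_log_state_t *st, bool self_linked,
                                   uint64_t node_guid, unsigned port_num)
{
    assert(st != NULL);

    if (!self_linked) {
        /* Condition cleared: re-arm so a later recurrence is logged again. */
        st->reported = false;
        return 0;
    }
    if (st->reported)
        return 0; /* already logged on an earlier sweep */

    st->reported = true;
    fprintf(stderr,
            "link from a port to itself: node 0x%016" PRIx64
            ", port number 0x%X\n",
            node_guid, port_num);
    return 1;
}
```

With this shape, repeated sweeps over an unchanged loopback produce a single line, and removing and re-inserting the connector produces a fresh one.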

Is this really a fatal condition ? Doesn't seem like it should be to me.

Also, OpenSM can "ride" this out with -y (stay on fatal) but is that safe
for this condition ?

Seems like something like an extra loopback bit should be added to some
port structure which should cause these links to be ignored. This bit would
then be reset when the peer is no longer itself.
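
To make the loopback bit idea concrete, here is a minimal sketch under stated assumptions. sketch_port_t and sketch_update_loopback are made-up names for illustration and do not correspond to the real OpenSM port structures; the condition simply mirrors the check quoted above (same node GUID, same port number, port number not 0).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified port record; not the real OpenSM structure. */
typedef struct sketch_port {
    uint64_t node_guid;   /* GUID of the node owning this port */
    uint8_t  port_num;    /* port number on that node */
    bool     is_loopback; /* the proposed extra loopback bit */
} sketch_port_t;

/*
 * Recompute the loopback bit from the peer reported for this port.
 * The bit is set when the peer is the port itself and is reset as soon
 * as the peer is anything else. Port 0 is never marked, matching the
 * Enhanced Port 0 exception in the quoted comment.
 * Returns true when the link should be ignored.
 */
static bool sketch_update_loopback(sketch_port_t *port,
                                   uint64_t peer_guid, uint8_t peer_port_num)
{
    assert(port != NULL);

    port->is_loopback = (port->node_guid == peer_guid) &&
                        (port->port_num == peer_port_num) &&
                        (port->port_num != 0);
    return port->is_loopback;
}
```

Recomputing the bit on every update, rather than only ever setting it, is what gives the reset behaviour without a separate clearing path.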

Also, is there a relationship of this with the 12x/duplicated GUID code ?

Thanks.

-- Hal


_______________________________________________
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
