I tracked down the issue to a bug in osm_lid_mgr.c

function:  __osm_lid_mgr_init_sweep(...)

The bad hardware was returning an assigned LID of 0xFFFF. In this function there is a loop,
shown below with line numbers, where opensm gets stuck:

    392   p_port_guid_tbl = &p_mgr->p_subn->port_guid_tbl;
    393
    394   for( p_port = (osm_port_t*)cl_qmap_head( p_port_guid_tbl );
    395        p_port != (osm_port_t*)cl_qmap_end( p_port_guid_tbl );
    396        p_port = (osm_port_t*)cl_qmap_next( &p_port->map_item ) )
    397   {
    398     osm_port_get_lid_range_ho(p_port, &disc_min_lid, &disc_max_lid);
    399     for (lid = disc_min_lid; lid <= disc_max_lid; lid++)                  <===== Bug here
    400       cl_ptr_vector_set(p_discovered_vec, lid, p_port );
    401   }

Since disc_min_lid and disc_max_lid are both 0xFFFF, and these are unsigned 16-bit numbers, incrementing
lid past 0xFFFF wraps it back to 0, so the condition in the for loop never becomes false and opensm is
stuck in the loop.  There are a couple of other places in that function that need fixing too.
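
To illustrate the wraparound, here is a minimal standalone sketch. It is not a patch against osm_lid_mgr.c: the helper names count_lids_safe and next_lid_16bit are made up for this example, and it assumes the loop counter in opensm is 16 bits wide. It shows one possible way to let the loop terminate, namely using a counter wider than the LID type.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of opensm: shows what happens to a
 * 16-bit LID counter when it is incremented. For 0xFFFF the result
 * wraps to 0, which is why "lid <= 0xFFFF" can never become false. */
static uint16_t next_lid_16bit(uint16_t lid)
{
    return (uint16_t)(lid + 1);
}

/* Hypothetical helper, not part of opensm: walks the inclusive range
 * [min_lid, max_lid] with a 32-bit counter and returns how many LIDs
 * were visited. Because the counter can hold 0x10000, the loop ends
 * even when max_lid is 0xFFFF. */
static size_t count_lids_safe(uint16_t min_lid, uint16_t max_lid)
{
    size_t visited = 0;
    uint32_t lid; /* wider than the 16-bit LID on purpose */

    for (lid = min_lid; lid <= max_lid; lid++)
        visited++; /* the real loop would record the port for this LID here */

    return visited;
}
```

With a 16-bit counter the same loop over the range 0xFFFF..0xFFFF would go from 0xFFFF back to 0 and keep running. Whether widening the counter or rejecting a discovered LID of 0xFFFF up front is the better fix for opensm is a separate question.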

-Viswa


On 9/27/05, Viswanath Krishnamurthy <[EMAIL PROTECTED]> wrote:
Log sent off-list...

-Viswa


On 9/27/05, Eitan Zahavi < [EMAIL PROTECTED]> wrote:
Hi Viswa,

Please send a full /var/log/osm.log file of opensm -V .
You can send us a copy off the list if it is too big:

yael and eitan in @ mellanox.co.il

EZ

Hal Rosenstock wrote:
> On Mon, 2005-09-26 at 19:57, Viswanath Krishnamurthy wrote:
>
>>I have an exerciser in the IB network. The exerciser seems to be
>>faulty/buggy. When opensm starts I do not
>>see the "SUBNET UP" message. It says "Entering MASTER" and waits there.
>>Any new node inserted in this state is not assigned a LID. Has anybody
>>seen such behavior?
>
>
> Any idea on how the IB exerciser misbehaves on the network ? Do you have
> an analyzer too ?
>
> What does the OSM log show ?
>
> -- Hal
>
> _______________________________________________
> openib-general mailing list
> [email protected]
> http://openib.org/mailman/listinfo/openib-general
>
> To unsubscribe, please visit
> http://openib.org/mailman/listinfo/openib-general
>



