Yael Kalka wrote:

Hi Hal,

During some Windows tests we discovered another problem in the
lid_mgr. The problem happened when 2 HCAs had the same lid: opensm
entered an infinite loop.
The following patch fixes this.

Thanks,
Yael

Signed-off-by:  Yael Kalka <[EMAIL PROTECTED]>

Index: opensm/osm_lid_mgr.c
===================================================================
--- opensm/osm_lid_mgr.c        (revision 4032)
+++ opensm/osm_lid_mgr.c        (working copy)
@@ -550,6 +550,9 @@ __osm_lid_mgr_init_sweep(
      {
              /* This port will use its local lid, and consume the entire required lid range.
                 Thus we can skip that range. */
+              /* If the disc_max_lid is greater than lid - we can skip right to it,
+                 since we've done all necessary checks on the lids in between. */
+              if (disc_max_lid > lid)
        lid = disc_max_lid;
      }
    }
@@ -593,7 +596,14 @@ __osm_lid_mgr_init_sweep(
  {
    p_range =
      (osm_lid_mgr_range_t *)cl_malloc(sizeof(osm_lid_mgr_range_t));
-    p_range->min_lid = 1;
+    /*
+       The p_range can be NULL in one of 2 cases:
+       1. If max_defined_lid == 0. In this case, we want the entire range.
+       2. If all lids discovered in the loop were mapped. In this case
+          no free range exists, and we want to define it after the last
+          mapped lid.
+    */
+    p_range->min_lid = lid;
  }
  p_range->max_lid = p_mgr->p_subn->max_unicast_lid_ho - 1;
  cl_qlist_insert_tail( &p_mgr->free_ranges, &p_range->item );
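
To illustrate why the guard in the first hunk matters, here is a minimal toy model in C. It is not opensm code: toy_sweep, skip_discovered_range, the disc_max table, and the 1000-step cap are all hypothetical names and values made up for this sketch, and whether the duplicate-lid case produces exactly this table in opensm is an assumption. The sketch only shows that an unguarded "lid = disc_max_lid" can move the loop counter backwards and never terminate, while the guarded version always moves forward.

```c
#include <stdint.h>

/* Hypothetical helper (not an opensm function): returns the lid the sweep
   should continue from after handling a port whose discovered lid range
   ends at disc_max_lid. With guarded == 0 it mimics the pre-patch
   unconditional assignment. */
static uint16_t skip_discovered_range(uint16_t lid, uint16_t disc_max_lid,
                                      int guarded)
{
    if (!guarded || disc_max_lid > lid)
        return disc_max_lid;   /* jump to the end of the range */
    return lid;                /* never jump backwards */
}

/* Toy sweep over lids 1..max_defined_lid. disc_max[lid] is the assumed
   last discovered lid of the port recorded at that lid, or 0 if no port
   is recorded there. Returns the number of iterations, or -1 once the
   arbitrary 1000-step cap is exceeded (standing in for "loops forever"). */
static int toy_sweep(const uint16_t *disc_max, uint16_t max_defined_lid,
                     int guarded)
{
    int steps = 0;
    uint16_t lid;

    for (lid = 1; lid <= max_defined_lid; lid++) {
        if (++steps > 1000)
            return -1;
        if (disc_max[lid] != 0)
            lid = skip_discovered_range(lid, disc_max[lid], guarded);
    }
    return steps;
}
```

With the table {0, 1, 2, 2, 0} (the port recorded at lid 3 now reports lid 2, as a second HCA with a duplicate lid might), the unguarded sweep bounces between lid 2 and lid 3 until it hits the cap, while the guarded sweep finishes after visiting each of the four lids once.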


_______________________________________________
openib-general mailing list
[email protected]
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

The opensm on the show floor is showing the following in oprofile:

with a unit mask of 0x01 (mandatory) count 100000
samples  %        app name                 symbol name
5970354  51.7020  libpthread-2.3.4.so      pthread_cond_timedwait@@GLIBC_2.3.2
5037621  43.6247  libosmcomp.so.1.0.0      __cl_timer_prov_cb
66241     0.5736  libosmcomp.so.1.0.0      anonymous symbol from section .plt
55929     0.4843  oprofiled                (no symbols)
49918     0.4323  opensm                   __osm_ucast_mgr_process_neighbors
39585     0.3428  vmlinux                  hpet_readl
25333     0.2194  oprofile                 (no symbols)
22734     0.1969  opreport                 (no symbols)
14724     0.1275  libcrypto.so.0.9.7a      (no symbols)
14296     0.1238  libc-2.3.4.so            __tzfile_compute
13901     0.1204  vmlinux                  __copy_to_user_ll


Is this the same loop?