On Sep 3, 2010, at 09:50, Brice Goglin wrote:

> On 03/09/2010 15:38, George Bosilca wrote:
>> Jeff,
>>
>> I think you will have to revert this patch as the btl_bandwidth __IS__
>> supposed to be in Mbs and not MBs. We usually talk about networks in Mbs
>> (there is a pattern in Ethernet 1G/10G, Myricom 10G). In addition the
>> original design of the multi-rail was based on this assumption, and the
>> multi-rail handling code deal with these values (at that level I don't think
>> it really matters, but at least it needs consistent values from all BTLs).
>>
>> However, going over the existing BTLs I can see that some BTLs do not
>> correctly set this value:
>>
>> BTL     Bandwidth       Auto-detect  Status
>> Elan    2000            NO           Correct
>>
>
> 2000 looks strange to me. Last time I played with Elan4, bandwidth was
> 900MB/s or so.

Lucky you ;) The 2000 was the bandwidth of the last Elan device we had.

>
>> GM      250             NO           Doubtful
>> MX      2000/10000      YES (Mbs)    Correct (before the patch)
>> OFUD    800             NO           Doubtful
>> OpenIB  2000/4000/8000  YES (Mbs)    Correct (multiplied by the
>>                                      active_width)
>>
>
> I found the problem when using both MX and OpenIB at the same time, so
> they can't be both wrong or both correct. IB was reporting 800, not
> 2000/4000/8000. Maybe because auto-detect didn't work and the default is
> wrong:
> btl_openib_mca.c:527: mca_btl_openib_module.super.btl_bandwidth = 800;

It appears that OpenIB only auto-detects the bandwidth if the value is
explicitly set to zero via the MCA parameters. As a last resort, as with the
other devices, you can set it manually. Use something like
btl_openib_bandwidth_%dev_name% to set the bandwidth per device.

  george.

>
> Brice
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
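
To make the unit issue from this thread concrete, here is a minimal sketch of how mixed units skew a bandwidth-proportional split across two rails. It is not Open MPI's multi-rail code: the helper names mbytes_to_mbits and first_rail_share are hypothetical, and the simple proportional weighting is an assumption made purely for illustration.

```c
/* Hypothetical helper: convert a bandwidth given in megabytes per second
 * (MB/s) to megabits per second (Mb/s). */
static inline unsigned mbytes_to_mbits(unsigned mbytes_per_s)
{
    return mbytes_per_s * 8u;
}

/* Hypothetical helper: percentage of the traffic assigned to the first
 * rail when the split is proportional to the reported bandwidths.
 * The result is only meaningful if both values use the same unit. */
static inline double first_rail_share(unsigned bw_first, unsigned bw_second)
{
    unsigned total = bw_first + bw_second;

    if (total == 0) {
        return 0.0;   /* no bandwidth reported by either rail */
    }
    return 100.0 * (double)bw_first / (double)total;
}
```

With the values mentioned above, first_rail_share(10000, 800) gives a 10G MX rail about 92.6% of the traffic next to an OpenIB rail left at its default of 800. If the same MX rail were reported as 1250 (its value in MB/s) while OpenIB stayed at 800, the share would drop to about 61%, even though neither link changed. Converting with mbytes_to_mbits(1250) first restores the 10000 figure.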