On Sep 3, 2010, at 9:38 AM, George Bosilca wrote:

> I think you will have to revert this patch as the btl_bandwidth __IS__ 
> supposed to be in Mbs and not MBs. We usually talk about networks in Mbs 
> (there is a pattern in Ethernet 1G/10G, Myricom 10G).

This is why I shouldn't commit patches for others, and why I'm glad I pushed 
Scott to commit the other fixes himself...

I'll revert; you, Scott, and Brice figure out what you want to do.

> In addition, the original design of the multi-rail was based on this 
> assumption, and the multi-rail handling code deals with these values (at that 
> level I don't think it really matters, but at least it needs consistent 
> values from all BTLs).
> 
> However, going over the existing BTLs I can see that some BTLs do not 
> correctly set this value:
> 
> BTL     Bandwidth        Auto-detect     Status
> Elan    2000                NO           Correct
> GM      250                 NO           Doubtful
> MX      2000/10000          YES (Mbs)    Correct (before the patch)
> OFUD    800                 NO           Doubtful
> OpenIB  2000/4000/8000      YES (Mbs)    Correct (multiplied by the active_width)
> Portals 1000                NO           Doubtful
> SCTP    100                 NO           Conservative value (correct)
> Self    100                 XXX          Correct (doesn't matter anyway)
> SM      9000                NO           Correct
> TCP     100                 NO           Conservative value (correct)
> UDAPL   225                 NO           Incorrect
> 
> Some of these BTL values do not make sense in either Mbs or MBs. Here is a 
> list of such BTLs: OFUD, Portals, UDAPL. If the corresponding developers can 
> provide the default bandwidth (in Mbs) I will update their values.

OFUD should be just like OpenFabrics.  But I doubt anyone cares.  Should we 
remove it?

UDAPL intentionally hides that kind of stuff; I don't know if it's possible to 
get it.  Rolf/Terry?

> For SCTP and TCP I don't know how to detect it reliably in a portable way, so 
> I expect to leave them set to this very conservative value. Moreover, the BTL 
> TCP is only used for multi-rail if the available high performance network 
> allows it, so it doesn't really matter.

Some servers have both 1Gb and 10Gb TCP interfaces, though...

It might be worth having even a Linux-specific way to auto-detect, just for 
this use case (which is becoming more common -- 1Gb LOM and a 10Gb non-iWARP 
NIC).
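
As a rough sketch of what a Linux-only auto-detect could look like: the kernel 
exposes the negotiated link speed, in Mb/s, in /sys/class/net/<iface>/speed. 
The function names and the FALLBACK_MBPS constant below are made up for 
illustration and are not existing TCP BTL code; the fallback simply mirrors 
the conservative default of 100 from the table above.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical fallback: the conservative default currently used for TCP. */
#define FALLBACK_MBPS 100

/* Read one integer (Mb/s) from a sysfs-style "speed" file.
 * Returns `fallback` if the file cannot be opened or read, or if the
 * value is not positive (the kernel may report -1, or fail the read,
 * when the link speed is unknown or the link is down). */
static int read_speed_file(const char *path, int fallback)
{
    long mbps = -1;
    FILE *fp = fopen(path, "r");

    if (NULL == fp) {
        return fallback;
    }
    if (1 != fscanf(fp, "%ld", &mbps)) {
        mbps = -1;
    }
    fclose(fp);

    return (mbps > 0 && mbps <= INT_MAX) ? (int) mbps : fallback;
}

/* Build the sysfs path for an interface name and return its speed in
 * Mb/s, or FALLBACK_MBPS if it cannot be determined. */
static int iface_bandwidth_mbps(const char *ifname)
{
    char path[256];
    int n = snprintf(path, sizeof(path), "/sys/class/net/%s/speed", ifname);

    if (n < 0 || (size_t) n >= sizeof(path)) {
        return FALLBACK_MBPS;
    }
    return read_speed_file(path, FALLBACK_MBPS);
}
```

Something like iface_bandwidth_mbps("eth0") would then give a value already in 
Mb/s, with no unit conversion needed, and anything that cannot be read just 
keeps today's default.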

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/

