I don't think so. From what I understand, the iboffload component may not live
much longer because of Mellanox's fork of Cheetah. So, it might not matter.

-Nathan

Excuse the *&(#$y Outlook posting-style. OWA sucks.
________________________________________
From: devel [devel-boun...@open-mpi.org] on behalf of Ralph Castain 
[r...@open-mpi.org]
Sent: Thursday, November 14, 2013 12:58 PM
To: Open MPI Developers
Subject: Re: [OMPI devel] [EXTERNAL] Re: [OMPI svn-full] svn:open-mpi r29703
- in trunk: contrib/platform/iu/odin ompi/mca/btl/openib
ompi/mca/btl/openib/connect

The key question, though, is: has anyone checked to see if the ofacm code even 
works any more??

Only oob and xoob components appear to be present - so unless someone fixed 
those since they were originally copied from openib, I doubt ofacm works.


On Nov 14, 2013, at 11:08 AM, Shamis, Pavel <sham...@ornl.gov> wrote:

> There is some confusion in the thread. UDCM is just another CPC, like XOOB, 
> OOB, and RDMACM (I think IBCM is officially dead).
> XOOB and OOB don't use UDCM; they rely on ORTE out-of-band communication.
>
> OpenIB/connect supports UDCM, XOOB, OOB, and RDMACM
> OFACM supports (at least the last time we checked) OOB and XOOB
>
> RDMACM was not moved to OFACM, because of iWarp's "first message" requirement 
> that used to break the abstraction.
> Moreover, RDMACM scalability used to be terrible; as a result, no one in the
> IB community really used it.
> The situation is a bit different today, since ROCEE relies on RDMACM. It is
> worth noting that you may set up ROCEE connections with a regular OOB, with
> some restrictions (we did it for mvapich-1).
>
> The code in ofacm and openib is similar, but NOT the same. We changed the
> API so that XRC QP management (there is a hash table that manages the
> QP-to-EP mapping) is hidden in OFACM instead of OPENIB.
> This made the openib initialization code a bit cleaner. Here is my old tree
> with the openib btl changes: https://bitbucket.org/pasha/ofacm
>
> I hope it helps,
>
> Best,
> Pasha
>
> On Nov 14, 2013, at 1:17 PM, Joshua Ladd <josh...@mellanox.com> wrote:
>
>> Unless someone went in and "fixed" the code in common (judging by the 
>> comments, fixed seems to imply porting (x)oob to use UDCM, which hasn't been 
>> done at all in the context of xoob and is incompletely patched and remains 
>> unusable as a replacement for oob in 1.7.4), there is no reason to believe 
>> it would work any differently than the cpcs under btl/openib/connect. IIRC, 
>> it's the same code - copy/pasted - just moved to a common location so 
>> Cheetah collectives can do their wireup. So, if oob cpc doesn't work, ofacm 
>> oob won't work either and, I guess, by extension, Cheetah IBoffload won't 
>> work. Pasha, correct me if you know different.
>>
>>
>> Josh
>>
>>
>> -----Original Message-----
>> From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Ralph Castain
>> Sent: Thursday, November 14, 2013 1:05 PM
>> To: Open MPI Developers
>> Subject: Re: [OMPI devel] [EXTERNAL] Re: [OMPI svn-full] svn:open-mpi r29703 
>> - in trunk: contrib/platform/iu/odin ompi/mca/btl/openib 
>> ompi/mca/btl/openib/connect
>>
>>
>> On Nov 14, 2013, at 9:33 AM, Barrett, Brian W <bwba...@sandia.gov> wrote:
>>
>>> On 11/14/13 9:51 AM, "Jeff Squyres (jsquyres)" <jsquy...@cisco.com> wrote:
>>>
>>>> Does XRC work with the UDCM CPC?
>>>>
>>>>
>>>> On Nov 14, 2013, at 9:35 AM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>
>>>>> I think the problems in udcm were fixed by Nathan quite some time
>>>>> ago, but never moved to 1.7 as everyone was told that the connect
>>>>> code in openib was already deprecated pending merge with the new
>>>>> ofacm common code. Looking over at that area, I see only oob and
>>>>> xoob - so if the users of the common ofacm code are finding that it
>>>>> works, the simple answer may just be to finally complete the switchover.
>>>>>
>>>>> Meantime, perhaps someone can CMR and review a copying of the udcm
>>>>> cpc to the 1.7 branch?
>>>>>
>>>>>
>>>>> On Nov 14, 2013, at 5:14 AM, Joshua Ladd <josh...@mellanox.com> wrote:
>>>>>
>>>>>> Um, no. It's supposed to work with UDCM which doesn't appear to be
>>>>>> enabled in 1.7.
>>>>>>
>>>>>> Per Ralph's comment to me last night:
>>>>>>
>>>>>> "... you cannot use the oob connection manager. It doesn't work and
>>>>>> was deprecated. You must use udcm, which is why things are supposed
>>>>>> to be set to do so by default. Please check the openib connect
>>>>>> priorities and correct them if necessary."
>>>>>>
>>>>>> However, it's never been enabled in 1.7 - don't know what "borked"
>>>>>> means, and from what Devendar tells me, several UDCM commits that
>>>>>> are in the trunk have not been pushed over to 1.7:
>>>>>>
>>>>>> So, as of this moment, OpenIB BTL is essentially dead-in-the-water
>>>>>> in 1.7.
>>>>>>
>>>>>>
>>>>>>
>>>
>>> I'm going to start by admitting that I haven't been paying attention
>>> to IB the last couple of months, so I'm out of my league a little bit
>>> here.  I remember discussions of UDCM replacing OOB both because the
>>> OOB CPC had some issues and because it would make it easier to move
>>> the BTLs to the OPAL layer (i.e., below the OOB).  But I also thought
>>> that was more future work than it clearly was.  So can someone let me know:
>>>
>>> 1) What the status of UDCM is (does it work reliably, does it support
>>> XRC, etc.)
>>
>> Seems to be working okay on the IB systems at LANL and IU. Don't know about 
>> XRC - I seem to recall the answer is "no"
>>
>>> 2) What's the difference between CPCs and OFACM, and what are our plans
>>> w.r.t. 1.7 there?
>>
>> Pasha created ofacm because some of the collective components now need to 
>> forge connections. So he created the common/ofacm code to meet those needs, 
>> with the intention of someday replacing the openib cpc's with the new common 
>> code. However, this was stalled by the iWarp issue, and so it fell off the 
>> table.
>>
>> We now have two duplicate ways of doing the same thing, but with code in two 
>> different places. :-(
>>
>>> 3) Someone mentioned that ofacm oob worked, but cpc oob didn't.  Can
>>> someone explain why?
>>
>> I'm not sure that is actually true as there is no indication that anyone is 
>> using or testing the collective components that use ofacm code.
>>
>>
>>>
>>> Again, sorry for being dense; I've been spending too much time in
>>> Portals land lately.
>>>
>>> Brian
>>>
>>> --
>>> Brian W. Barrett
>>> Scalable System Software Group
>>> Sandia National Laboratories
>>>
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>

