On Nov 14, 2013, at 1:03 PM, Ralph Castain <r...@open-mpi.org> wrote:

>> 1) What the status of UDCM is (does it work reliably, does it support
>> XRC, etc.)
>
> Seems to be working okay on the IB systems at LANL and IU. Don't know about
> XRC - I seem to recall the answer is "no"

FWIW, I recall that when Cisco was testing UDCM (a long time ago -- before we
threw away our IB gear...), we found bugs in UDCM that only showed up with
really large numbers of MTT tests running UDCM (i.e., 10K+ tests a night,
especially with lots of UDCM-based jobs running concurrently on the same
cluster). These types of bugs didn't show up in casual testing.

Has that happened with the new/fixed UDCM? Cisco is no longer in a position
to test this.

>> 2) What's the difference between CPCs and OFACM and what's our plans
>> w.r.t 1.7 there?
>
> Pasha created ofacm because some of the collective components now need to
> forge connections. So he created the common/ofacm code to meet those needs,
> with the intention of someday replacing the openib cpc's with the new common
> code. However, this was stalled by the iWarp issue, and so it fell off the
> table.
>
> We now have two duplicate ways of doing the same thing, but with code in two
> different places. :-(

FWIW, the iWARP vendors have repeatedly been warned that ofacm is going to
take over, and unless they supply patches, iWarp will stop working in Open
MPI. I know for a fact that they are very aware of this.

So my $0.02 is that ofacm should take over -- let's get rid of CPC and have
openib use the ofacm. The iWarp folks can play catch up if/when they want to.

Of course, I'm not in this part of the code base any more, so it's not really
my call -- just my $0.02...

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/