On Wed, Dec 12, 2007 at 01:35:33PM -0500, Jeff Squyres wrote:
> I agree with Gleb's idea.  More below.
>
> On Dec 12, 2007, at 12:24 PM, Jon Mason wrote:
>
> > Ok, glad I got this conversation started :)
> >
> > So, we need a slight redesign to determine the cm method (unless
> > forced via commandline arg).  This can be determined by calling all
> > the individual open routines, and having them return a priority
> > based on their ability to function.  For example, the xoob open
> > function will check the mca_btl_openib_component.num_xrc_qps for a
> > non-zero value and return the priority based on that.
> >
> > Of course, if forced then it will only call that specific open
> > function and throw any relevant errors as necessary.
>
> Close, but I'd do it slightly differently:
>
> - open() is *only* used for creating MCA params.  It's a bad name, but
>   it's unfortunately the precedent throughout the rest of the OMPI code
>   base.  :-\  (it has roots in the ompi_info command -- ompi_info has to
>   be able to get a full list of all MCA params regardless of what
>   hardware is available on the current system)
>
> - during the openib component startup, we should add a query()
>   function that does what you describe.  I.e., we query() each endpoint
>   and it either returns a valid priority or "I don't want to be used
>   with this endpoint."
>
> - there should be a priority MCA param for every CPC.  Perhaps the CPC
>   base can handle this...?  I'm not sure; it may need to be down in each
>   CPC.
>
> - the list of CPCs that want to run with each endpoint are ordered by
>   priority (ties will be arbitrarily, but deterministically, broken --
>   alphabetical?) and sent around in the modex.
>
> - when a new connection comes up, the intersection of the CPC lists
>   for the near and far endpoints is computed and the highest priority
>   CPC is used to make the connection.  Since everyone has the same data,
>   both sides will make the same decision.
>
> - CPC init may have to change a bit -- more than one CPC may be used
>   for a given endpoint because both the local module and the remote
>   module are involved in making the decision of which CPC is used.
>
> After this first cut is done, we should probably also add
> btl_openib_cpc_include and btl_openib_cpc_exclude as I described in a
> prior mail (just like *_if_include and *_if_exclude in several BTLs)
> to include/exclude sets of CPCs at run-time.
>
> > If this sounds sane, then let me know and I'll start coding it up.
>
> This has actually been on my to-do list for too long; if you have the
> cycles to do this now, it would be great...

Since I need to have it done before I can do my rdma_cm bits, I'll add
this to my queue and get started immediately.

> I'll make you a bargain: if you do the stuff above, I'll add in the
> configure/build mojo for selectively compiling the XOOB CPC or not
> (depending on whether the underlying system has XRC library support or
> not).  Cool?
>
> Let's go off on a /tmp-public branch for this so we don't hose the
> trunk...  I just made /tmp-public/openib-cpc.
>
> --
> Jeff Squyres
> Cisco Systems
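
For reference, the selection rule proposed in the quoted text (intersect the near and far CPC lists, take the highest priority, break ties alphabetically) could be sketched in C roughly as below. This is only an illustration, not Open MPI code: cpc_entry_t, cpc_find() and cpc_select() are hypothetical names, and the thread does not say how to combine priorities when the two sides assign different values to the same CPC. The sketch assumes they are summed, which is one symmetric choice that lets both peers reach the same answer from the same modex data.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record: one CPC that agreed to serve an endpoint. */
typedef struct {
    const char *name;   /* e.g. "oob", "xoob", "rdma_cm" */
    int priority;       /* value a query() call would have returned */
} cpc_entry_t;

/* Return the entry called `name` in `list`, or NULL if it is absent. */
static const cpc_entry_t *cpc_find(const cpc_entry_t *list, size_t n,
                                   const char *name)
{
    for (size_t i = 0; i < n; ++i) {
        if (0 == strcmp(list[i].name, name)) {
            return &list[i];
        }
    }
    return NULL;
}

/*
 * Pick the CPC for a new connection from the local and remote lists.
 * Only CPCs present in both lists are candidates.  The score is the
 * sum of both sides' priorities (an assumption, see above), and equal
 * scores are resolved in favour of the alphabetically first name.
 * Returns NULL when the two lists have no CPC in common.
 */
static const char *cpc_select(const cpc_entry_t *local, size_t nlocal,
                              const cpc_entry_t *remote, size_t nremote)
{
    const char *best = NULL;
    int best_score = 0;

    for (size_t i = 0; i < nlocal; ++i) {
        const cpc_entry_t *peer = cpc_find(remote, nremote, local[i].name);
        if (NULL == peer) {
            continue;   /* the far endpoint did not offer this CPC */
        }
        int score = local[i].priority + peer->priority;
        if (NULL == best || score > best_score ||
            (score == best_score && strcmp(local[i].name, best) < 0)) {
            best = local[i].name;
            best_score = score;
        }
    }
    return best;
}
```

Because the score and the tie-break are both symmetric, calling cpc_select() with the two lists swapped names the same CPC, which is the property the proposal relies on.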