Re: [OMPI devel] BTL move - the notion

Ralph Castain Fri, 5 Dec 2008 08:44:10 -0500

I'll answer this outside of Terry's reply so we can stay under George's page limit. :-))

I don't have any philosophical opposition to the idea. Indeed, there are places where I would potentially have some use for the btl's, perhaps as an alternative comm channel in the OOB. I will point out, though, that there are several things we thought when we started this project that have proven unworkable over time. For example, the idea that the RTE could be a general purpose one without impacting OMPI proved incorrect and has been abandoned. It may well be that the notion of using the BTL's for non-OMPI projects will fall into that category as well - not saying it does, but I think it is still TBD.

That said, I do have some significant concerns about -how- this is done that fall into two categories:


1. Procedural

Keeping the common code in the OMPI repository can raise quite a bit of trouble with synchronizing release cycles. We are just about to exit a period of requested "quiet" time on the trunk to stabilize it for the 1.3 release. If STCI had been in an active development phase, this could have caused a major problem, as we would have demanded they not commit to our code repository. It is easy to foresee the reverse situation. Indeed, from working on several other similar projects, I know this problem is not only common, but frequent. How do we intend to work this out?

I am also concerned about slowing down OMPI's development efforts due to the need to coordinate proposed changes with an even broader community, and one that will have conflicting requirements/schedules. We already have problems getting people to stay adequately involved as changes are proposed and made, especially as the community's members have become involved in other efforts over time. It would become unworkable if we take months to touch base with everyone who might be impacted and get general consensus on changes required by OMPI. As Terry said, we have to maintain OMPI's agility.

We all need to keep something in mind here. While this discussion is about the BTL's and coordinating with STCI, we are talking about a general method of operation that will have to be extended to anyone with a similar request. There already are other groups out there, some competing with STCI, that have issued similar requests for sharing various pieces of the code base (the ones coming to me mostly pertain to the RTE). So whatever we do should be generalizable - it can't just be a point solution for STCI.

I am disturbed by the immediate rejection of methods developed and used by other large code projects that address this very problem. Both Hg and GIT were developed specifically with this code sharing synchronization issue in mind, and have enjoyed rapid adoption and get rave reviews for their solutions. This approach provides maximum flexibility, but requires a bit of a learning curve and admittedly more attention to maintenance details. However, other projects in similar circumstances have found it highly beneficial. I would think we should at least consider what is becoming the state-of-the-art method for code sharing before simply rejecting this approach as too much maintenance.



2. Technical

I think we all agree that STCI and OMPI have different objectives and requirements. OMPI is facing the need to launch and operate at extreme scales by next summer, has received a lot of interest in having it report errors into various systems, etc. We don't have all the answers as to what will be necessary to meet these requirements, but indications so far are that tighter integration, not deeper abstraction, between the various layers will be needed. By that, I don't mean we will violate abstraction layers, but rather that the various layers need to work more as a tightly tuned instrument, with each layer operating based on a clear knowledge of how the other layers are functioning.

For example, for modex-less operations, the MPI/BTLs have to know that the RTE/OS will be providing certain information. This means that they don't have to go out and discover it themselves every time. Yes, we will leave that as the default behavior so that small and/or unmanaged clusters can operate, but we have to also introduce logic that can detect when we are utilizing this alternative capability and exploit it. While we are trying our best to avoid introducing RTE-like calls into the code, the fact is that we may well have to do so (we have already identified one btl that will definitely need to). It is simply too early to make the decision to cut that off now - we don't know what the long-term impacts of such a decision will be.

Finally, although I don't do much on the MPI layer, I am concerned about performance. I would tend to oppose any additional abstraction until we can measure the performance impact. Thus, I would like to see the BTL move done on a tmp branch (technology to branch up to the implementer - I don't care) so we can verify that it isn't hurting us in some unforeseeable manner.

So I guess my concerns really boil down to dealing with conflicting schedules and requirements, how to support multiple possibly competing groups that want to share one or more parts of our code base, and retaining an OMPI-first philosophy when it comes to what changes get made. My proposed solution is:

1. shift our repository to a technical solution that supports broader code sharing

2. have the non-OMPI groups access our code base via that technology. They can "pull" changes at will, subject to the licensing agreement. It is true that they may have to do some local editing if the change hits a spot where they have local mods to support their system, but both Hg and GIT are very good at handling this - much better than svn ever has been.

3. if there are minor mods required to make the BTL code area easier to share via the above methods, then we should explore and implement them. Certainly, renaming #define values would seem a no-brainer. I suspect there are other similar things that could be done. Removing orte/opal dependencies is more controversial and would need to be thoroughly examined.

4. OMPI decides what changes get made to its code base. We are polite about it and talk to the other groups to try and minimize impact, but ultimately we do what is best for OMPI, and send out notifications (perhaps a new mailing list specifically for that purpose) when changes occur. Note that this would have helped the Eclipse group enormously as otherwise they drown in the devel list trying to spot the changes.


My $0.0002 - hope it helps
Ralph


On Dec 4, 2008, at 6:00 PM, Richard Graham wrote:

Let me start the e-mail conversation, and see how far we get.
Goal: The goal several of us have is to be able to use the btl’s outside of the MPI layer in Open MPI. The layer itself is generic, w/o specific knowledge of Upper Level Protocols, so is well suited for this sort of use.
Technical Approach: What we have suggested is to start the process with the Open MPI code base, and make it independent of the mpi-layer (which it is now), and the run-time layer.
Before we get into any specific technical details, the first question I have is: are people totally opposed to the notion of making the btl’s independent of MPI and the run-time? This does not mean that it can’t be used by it, but that there are well defined abstraction layers, i.e., are people against the goal in the first place?
What are alternative suggestions to the technical approach?
One suggestion has been to branch and patch. To me this is a long-term maintenance nightmare.
What are people's thoughts here?

Rich

_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel
