I'm going to "re-integrate" Jeff and Brian's comments into one response.
I have no problem with either of their observations. I only included the event library, backtrace, and PLPA in my list for completeness. I expected we would continue to treat those as we are, recognizing that this means -someone- is going to have to step up to support those when we need to update them. In the event library case, I know people have talked about a major change coming soon - a release that has significant improvement we may care about. Not sure when that might happen, or who is going to do that integration.

As to ROMIO: as with many of the community's "planned" contributions, they have tended to fade with time and personnel turnover. At this time, there is no way LANL could support a ROMIO integration without a significant delay to the proposed 1.3 release schedule. Not that such a delay particularly bothers me - I don't see a pressing need to just throw something out there, and I have been beaten severely around the neck-and-shoulders the last two days about how out of date our ROMIO version is, and that it lacks a critical Panasas patch that is severely impacting performance.

I'll continue to talk to people here about possibly getting help with ROMIO. I don't know the prospects, but it will take some time for someone to become familiar enough with our code base/build system to make a real contribution. Alternatively, -I- may have to take this on, which will definitely delay the 1.3 RTE work, effectively just transferring the "blocker" from one part of the code to another. ;-)

But we can deal with that on a separate thread. For now, I think Jeff's last response to the other thread is where we are converging: delay work on a 3rd party contribution system until we have more cycles, but don't bring more 3rd party code (post-libNBC) in until we have a better mechanism.
Ralph

On 2/8/08 9:06 AM, "Jeff Squyres" <jsquy...@cisco.com> wrote:

> On Feb 8, 2008, at 10:38 AM, Ralph Castain wrote:
>
>> I thought maybe we should move this to another thread as it really isn't
>> about Torsten's specific RFC.
>>
>> I just took a quick gander at the code base to see how extensive this
>> problem might really be per Terry's concern. What I found was that we have
>> added 3rd party code in several places. How we want to define them in terms
>> of this issue is probably something for discussion.
>>
>> Packages I could readily identify include:
>>
>> 1. event library
>> 4. backtrace
>> 5. PLPA - this one is a little less obvious, but still being released as a
>> separate package
>
> FWIW, these packages are part of "core" OMPI and are not especially
> problematic. We upgrade them when we have a need or desire to (which
> has been low frequency); we don't try to stay in sync with their
> release schedules at all.
>
>> 2. ROMIO
>
> ROMIO has traditionally been a problem (keeping up with its releases
> and patches). We have long-since agreed that we definitely want to
> include ROMIO in our tarball, even though that presents challenges.
> One thing that makes it *slightly* easier is that Brian added the
> mechanics for OMPI to use a ROMIO that is outside of Open MPI rather
> than the one that is bundled with it. It's not a perfect solution,
> but it does help some.
>
>> 3. VT
>> 6. libNBC
>
> These two are definitely in the "contrib" category.
>
>> There may well be others - these are only the ones I know about. By 3rd
>> party package, I mean these are blocks of code obtained as a complete,
>> distinct version and "dropped in" to the OMPI code repository, and then to
>> some degree tied into our build system. They are not code specifically
>> developed for OMPI by OMPI developers.
>
> Those are all that I'm aware of.
>
>> We have already discussed the issues with this approach. I am particularly
>> concerned with the maintenance and release cycle issues right now.
>>
>> If these packages could be linked to our code instead of embedded within it,
>> then it seems to me that updating them could become much easier. For
>> example, we could download and install the latest ROMIO + Panasas patch,
>> compile it, and simply link it into libompi - without occupying someone with
>> constantly fixing the build system issues, etc.
>
> FWIW:
>
> - event,backtrace,PLPA,ROMIO are included in OMPI because we wanted to
> certify them as part of "core" OMPI. That is, we wanted to certify
> the whole system (vs. relying on [untested] combinations of versions
> that already exist on users' systems).
>
> - ROMIO is likely the only one of that group that presents ongoing
> logistics problems. The mechanism Brian added was seen as a
> workaround. Argonne will definitely need to be involved at some level
> to improve the ROMIO integration. Some talks started between Brian,
> me, and Rob(ANL) about a) making our integration better/easier, and b)
> having access to the ROMIO SVN to be able to suck down releases when
> we want to, but they kinda tapered off (Brian left and I got other
> priorities). There was also talk of LANL maintaining its own ROMIO
> tree and pushing it into OMPI, but I don't know what happened there.
> I can help with part of the ROMIO make-the-integration-easier (not in
> the immediate future, though -- probably not for a few weeks), but I
> do not think that I can do it on an ongoing basis. Note, too, that
> ROMIO is no longer distributed as a separate package -- it's only
> included in MPICH2. So it's a little harder to just link against a
> ROMIO that is already installed on a system -- there won't be one that
> isn't already bundled with an MPI.
>
> - vt and libnbc are a different category; they are add-on
> functionality, not "core" OMPI.
>
>> Obviously, I don't claim to know enough about what was done to integrate
>> ROMIO to know if this would easily work. I only use it to illustrate the
>> point - the same could be said about the event library, for example.
>>
>> Given our maintenance support problems, it would seem to me that changing
>> the way we do 3rd party packaging may be worth consideration and some
>> effort. I can't prioritize that relative to 1.3, though I do note that, from
>> LANL's perspective, the ROMIO issue is a definite blocker for 1.3 release.
>
> Hmm. This is odd because of the prior statements about ROMIO from
> LANL (that LANL was going to maintain ROMIO and push it into OMPI).
> I'm assuming that's changed?
>
> If ROMIO is a v1.3 blocker for LANL, can LANL commit resources to
> fixing the problem?