On Jul 21, 2008, at 6:57 PM, Brian W. Barrett wrote:

I guess I don't understand. I thought there were three versions in every component -- the MCA version, the framework version, and the component version. The first two should determine if the component can safely be loaded and the third is to identify the component. I agree that for this change (an MCA-level change), the MCA version *should* change. However, the framework interface didn't change (well, not as a result of this change), meaning that the framework version *should not* change. The MCA load infrastructure should see that the MCA versions don't match, and not load the component.


Josh and I wrestled with this question for a bit and probably fell down on the side of conservatism; that's where this came from. There were two reasons why we went this way:

1. You could (for example) have a coll framework v1.2.3 component built with MCA v1.0.0 and the same coll framework v1.2.3 component built against MCA v2.0.0, and they would be different. Worse, they won't be "equal". Specifically, MCA 2.0.0 supports some minor features that v1.0.0 doesn't -- so even though you have 2 of the "same" component, they're not really the same. (*more on this below)

2. Another issue seemed pretty icky to solve, which led us to fall down a little heavier on the side of bumping all the framework version numbers. Let's say you have some Foo framework DSOs, some of which are MCA v1.0.0 and some of which are v2.0.0. The Foo framework interface is the same between the two. The MCA base can find/open all of them easily enough; but how do we return all the components to the caller? I could think of 3 ways:

A. return multiple lists to the caller: a list of each of v1.0.0 and v2.0.0 components. This means that every framework will need to handle (or be able to reject or specify to the MCA base to reject before even accepting as available) both MCA v1.0.0 and v2.0.0 components.

B. return a single list to the caller with both MCA component versions in the list. Pretty much the same as A, but it scales better if we get in the business of changing the MCA version a lot (please God no); I mention it mainly for completeness.

C. return a single list to the caller with all components "upgraded" to MCA v2.0.0. This seems like a nice solution -- a la the experiment we tried with coll a long time ago to prove to ourselves that run-time versioning could work (for those of you who don't remember: we had some coll v1.0.0 and some v1.1.0 components; the coll base transparently handled everything at run-time). However, there's a problem with this idea: since all frameworks use the component struct as a "super" for their component structs, the MCA base does not know the total size of the component public struct. So it cannot "upgrade" the MCA v1.0.0 structure in memory to a v2.0.0, because the v2.0.0 struct is bigger than the v1.0.0 struct. So we can't just magically treat everything as v2.0.0 components at the MCA base level; we'd have to have the frameworks transmogrify their own components (although we might be able to have some MCA base helper function that does the heavy lifting, as long as the framework supplied the total struct length).

Note that all three of these solutions involve touching every framework in some way (although not every component).
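
To make the struct-size problem in option C concrete, here is a rough C sketch. Everything in it is hypothetical and simplified for illustration: the struct names (base_v1_t, base_v2_t, coll_v1_component_t, coll_v2_component_t), the build_info member, and the upgrade_component() / coll_upgrade() helpers are not the real MCA definitions. It only shows the shape of the idea: the base knows how to translate its own "super" part, but the framework has to supply the total length and the offsets of its own members.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the MCA base component struct. */
typedef struct {
    int  mca_major, mca_minor, mca_release;
    char name[32];
} base_v1_t;

typedef struct {
    int  mca_major, mca_minor, mca_release;
    char name[32];
    char build_info[16];   /* hypothetical new member: v2 is bigger than v1 */
} base_v2_t;

/* A framework uses the base struct as the "super" of its own struct. */
typedef struct {
    base_v1_t super;
    int (*coll_init)(void);
} coll_v1_component_t;

typedef struct {
    base_v2_t super;
    int (*coll_init)(void);
} coll_v2_component_t;

/*
 * Hypothetical base helper.  The base only knows the layout of base_v1_t
 * and base_v2_t, so the framework must pass in the total length of its
 * v2 struct plus where its own members start in each layout.
 * Returns a newly allocated v2-layout copy, or NULL on error.
 */
static void *upgrade_component(const void *old, size_t old_tail_off,
                               size_t new_total_len, size_t new_tail_off,
                               size_t tail_len)
{
    const base_v1_t *v1 = old;
    unsigned char *buf;
    base_v2_t *v2;

    assert(old != NULL);
    if (new_tail_off + tail_len > new_total_len) {
        return NULL;
    }
    buf = calloc(1, new_total_len);
    if (buf == NULL) {
        return NULL;
    }

    /* Translate the part the base understands. */
    v2 = (base_v2_t *) buf;
    v2->mca_major = 2;
    v2->mca_minor = 0;
    v2->mca_release = 0;
    memcpy(v2->name, v1->name, sizeof(v2->name));
    /* build_info stays zeroed: a v1 component has nothing to put there. */

    /* Copy the framework-specific members, which are opaque to the base. */
    memcpy(buf + new_tail_off,
           (const unsigned char *) old + old_tail_off, tail_len);
    return buf;
}

/* The framework side: it is the only place that knows the full layout. */
static coll_v2_component_t *coll_upgrade(const coll_v1_component_t *old)
{
    return upgrade_component(old,
                             offsetof(coll_v1_component_t, coll_init),
                             sizeof(coll_v2_component_t),
                             offsetof(coll_v2_component_t, coll_init),
                             sizeof(old->coll_init));
}

/* Placeholder framework function so a v1 component can be filled in. */
static int example_coll_init(void) { return 0; }
```

Note that the sketch passes member offsets as well as the total length, because padding after the "super" can differ between the two layouts. Either way, each framework still needs its own small wrapper like coll_upgrade(), which is the "touching every framework" cost mentioned above.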

----

All that being said, I suppose there are two arguments against worrying about these kinds of issues:

- this situation probably won't happen in practice (component A compiled against MCA v1.0.0 and against MCA v2.0.0) because we only distribute components as part of full OMPI releases, and therefore they're fairly tightly bound to their MCA version. However, for components that didn't change between OMPI v1.2 and v1.3, you *will* have this scenario, but in different OMPI installation directories (and therefore it pretty much doesn't matter).

- I think the crux of Brian's argument is that the framework's version number identifies *the framework's* interface -- not the whole interface (i.e., not including the MCA base interface). From this perspective, it *is* independent of the MCA version number. Specifically: the version of the framework interface is independent of the binary compatibility and feature issues surrounding the MCA base.
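
A tiny C sketch of that second view, with the two versions checked independently at load time. All names here (version_t, component_info_t, check_component(), the load_result_t values) are made up for illustration, and the exact-match policy is an assumption, not what the MCA base actually does:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical version triple. */
typedef struct {
    int major, minor, release;
} version_t;

/* Hypothetical summary of the three versions every component carries. */
typedef struct {
    version_t   mca;        /* MCA base interface it was built against */
    version_t   framework;  /* framework interface it implements */
    version_t   component;  /* identifies the component itself */
    const char *name;
} component_info_t;

typedef enum {
    LOAD_OK,
    LOAD_MCA_MISMATCH,
    LOAD_FRAMEWORK_MISMATCH
} load_result_t;

static bool version_equal(version_t a, version_t b)
{
    return a.major == b.major && a.minor == b.minor &&
           a.release == b.release;
}

/*
 * Assumed policy: both the MCA version and the framework version must
 * match exactly.  The component version is not consulted at all; it is
 * only there to identify the component.
 */
static load_result_t check_component(const component_info_t *c,
                                     version_t mca_expected,
                                     version_t framework_expected)
{
    assert(c != NULL);
    if (!version_equal(c->mca, mca_expected)) {
        return LOAD_MCA_MISMATCH;
    }
    if (!version_equal(c->framework, framework_expected)) {
        return LOAD_FRAMEWORK_MISMATCH;
    }
    return LOAD_OK;
}
```

With a check like this, a coll v1.2.3 component built against MCA v1.0.0 would be rejected by an MCA v2.0.0 base on the MCA version alone, without the framework version having to change.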

-----

So Josh and I thought we picked a solution that was clear, simple, and one-of-several sucky options. :-\ We could probably be convinced to go another way if someone has strong feelings.

--
Jeff Squyres
Cisco Systems
