On Apr 14, 2014, at 10:59 AM, Mike Dubman <mi...@dev.mellanox.co.il> wrote:

> There is no correlation between built_with and running_with versions of 
> external libraries supported by OMPI.

Ah, I see -- yes, that's the disconnect here.

I think one use case that shows this is the following:

1. Admin Bob builds Open MPI on the cluster head node with dependent library 
libfoo.so version A.B.C, which is a fully supported configuration.  Therefore, 
the appropriate configure.m4's are happy, and everything builds and installs.

2. But when User Betty goes to run, the libfoo.so on the back-end compute nodes 
is accidentally version X.Y.Z, which is *not* supported.  And Bad Things happen.

3. So you'd like to be able to run ompi_info on the head node and on the 
compute nodes and compare the output, and see an obvious difference of A.B.C 
vs. X.Y.Z in the dependent library of a given component, and use that to help 
figure out what is going wrong.
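
As a rough sketch of the comparison in step 3: assume each node can produce a list of "library: version" lines for its dependent libraries. That format is made up for illustration (it is not what ompi_info prints today), and the names parse_versions and diff_versions are hypothetical.

```python
def parse_versions(text):
    """Build a {library: version} map from lines shaped like 'name: version'."""
    versions = {}
    for line in text.splitlines():
        name, sep, version = line.partition(":")
        if sep:  # skip lines that have no 'name: version' shape
            versions[name.strip()] = version.strip()
    return versions


def diff_versions(head_text, compute_text):
    """Return {library: (head_version, compute_version)} for every mismatch."""
    head = parse_versions(head_text)
    compute = parse_versions(compute_text)
    mismatches = {}
    for name in sorted(set(head) | set(compute)):
        h, c = head.get(name), compute.get(name)
        if h != c:
            mismatches[name] = (h, c)
    return mismatches


# Hypothetical output captured on the head node and on a compute node.
HEAD = "libfoo: A.B.C\nlibbar: 1.2.3\n"
COMPUTE = "libfoo: X.Y.Z\nlibbar: 1.2.3\n"

for name, (h, c) in diff_versions(HEAD, COMPUTE).items():
    print(f"{name}: head node has {h}, compute node has {c}")
```

With the sample inputs above, only libfoo is reported, which is the "obvious difference" Bob and Betty would be looking for.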

> The next release of external library does not mean we should remove code in 
> ompi for all previous supported releases for the same library.

This is another use case: OMPI was built against dep library libfoo.so A.B.C 
(which is a supported config).  But then someone does an upgrade of libfoo 
*without rebuilding OMPI*, and now OMPI run-time links against libfoo.so X.Y.Z, 
which is no longer a supported configuration.
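
A minimal sketch of the check this implies, assuming a component can record the version it was built with and query the version it is running with. The function name check_dep_version and the supported set are hypothetical, not anything in the OMPI code base.

```python
def check_dep_version(built_with, running_with, supported):
    """Return warnings about a dependent library's run-time version."""
    warnings = []
    if running_with != built_with:
        warnings.append(
            f"built with {built_with} but running with {running_with}"
        )
    if running_with not in supported:
        warnings.append(
            f"run-time version {running_with} is not a supported configuration"
        )
    return warnings


# Hypothetical: OMPI was built against libfoo A.B.C, then libfoo was upgraded.
SUPPORTED = {"A.B.C"}
for warning in check_dep_version("A.B.C", "X.Y.Z", SUPPORTED):
    print("libfoo:", warning)
```

Neither condition can be detected at configure time, since both depend on what the run-time linker actually finds.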

> Why are you so against it? I don`t see any issue with printing ext lib 
> version number in the MCA description, something that can improve 
> sysadmin/user-experience.

FWIW, we've done this before by putting them in read-only MCA parameters -- 
we've called them "info" MCA params.

I don't see any in the code base today, but I know we've definitely had MCA 
params that report version information before.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/
