FWIW, this turned out to be a combination of two things:

1. Ralph's experimental DSOs were referencing a missing symbol, causing the underlying dlopen() to fail

2. LT 2.2.6 is incorrectly (IMHO) reporting "file not found" instead of "missing symbol" through lt_dlerror(), which is darned confusing (because you look at the filesystem and say, "but the file is there!!"). I have posted to the LT bug mailing list about it:

    http://lists.gnu.org/archive/html/bug-libtool/2008-10/msg00017.html
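
For anyone who hits the same confusion, one way to see the real reason a DSO fails to load is to bypass libltdl and call dlopen() directly, then print what dlerror() returns. The following is only a minimal sketch under that assumption: try_load() is a hypothetical helper name, not part of Open MPI or Libtool, and the exact error text depends on the platform's dynamic loader.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not an Open MPI or Libtool function).
 * Try to dlopen() `path` and resolve all of its symbols immediately.
 * Returns 0 on success, -1 on failure. On failure, the loader's own
 * dlerror() message is copied into errbuf. */
int try_load(const char *path, char *errbuf, size_t errlen)
{
    assert(path != NULL && errbuf != NULL && errlen > 0);
    errbuf[0] = '\0';

    dlerror();  /* clear any stale error state */

    /* RTLD_NOW forces all undefined symbols to be resolved before
     * dlopen() returns, so a missing symbol is reported here rather
     * than at first use. */
    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL) {
        const char *msg = dlerror();
        snprintf(errbuf, errlen, "%s",
                 msg != NULL ? msg : "unknown dlopen error");
        return -1;
    }

    dlclose(handle);
    return 0;
}
```

Calling try_load() on the component file in question and printing errbuf shows the loader's own message, which should make it possible to tell a genuinely absent file from a file that is present but cannot be loaded. On older glibc systems this needs to be linked with -ldl.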



On Oct 22, 2008, at 3:36 PM, Ralph Castain wrote:

Hmmm...interesting. I see what's going on - I'm having a build system issue that is causing some of the dynamic libraries to not be seen.

Red herring - thanks for clarifying!

Camille: thanks for fixing this way back when.

Ralph


On Oct 22, 2008, at 1:17 PM, George Bosilca wrote:

Ralph,

This problem was fixed long ago by some of the work Camille did. The exact revision number is r15402 (https://svn.open-mpi.org/trac/ompi/changeset/15402). I'm using this feature daily and so far I haven't had any problems with it.

To reuse your example here is what Camille came up with.

$ mpiexec --mca routed_base_verbose 30 -n 3 hostname
[dancer:09638] mca: base: components_open: Looking for routed components
[dancer:09638] mca: base: components_open: opening routed components
[dancer:09638] mca: base: components_open: found loaded component binomial
[dancer:09638] mca: base: components_open: component binomial has no register function
[dancer:09638] mca: base: components_open: component binomial has no open function
[dancer:09638] mca: base: components_open: found loaded component direct
[dancer:09638] mca: base: components_open: component direct has no register function
[dancer:09638] mca: base: components_open: component direct has no open function
[dancer:09638] mca: base: components_open: found loaded component linear
[dancer:09638] mca: base: components_open: component linear has no register function
[dancer:09638] mca: base: components_open: component linear has no open function
[dancer:09638] mca:base:select: Auto-selecting routed components
[...]

And if we force a special component:

$ mpiexec --mca routed linear --mca routed_base_verbose 30 -n 3 hostname
[dancer:09642] mca: base: components_open: Looking for routed components
[dancer:09642] mca: base: components_open: opening routed components
[dancer:09642] mca: base: components_open: found loaded component linear
[dancer:09642] mca: base: components_open: component linear has no register function
[dancer:09642] mca: base: components_open: component linear has no open function
[dancer:09642] mca:base:select: Auto-selecting routed components
[...]

I wonder what configuration options you're using?

george.

On Oct 22, 2008, at 1:30 PM, Ralph Castain wrote:

I've been digging a little into optimization and found something that seems counterintuitive in the way OMPI is handling components. Specifically, if I specify a component I want used for a framework, OMPI still does a component load and open on every component in the framework - it only uses my specification during "select".

Thus, the cmd line

mpirun -mca routed linear

still results in the loading and opening of the direct and binomial components - even though we have directed the framework not to use them.

This causes us to waste memory when there is no possibility of a different component being selected. Is there a reason why "open" isn't using the mca params to guide the components it is loading?
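
To make the question concrete, here is a minimal sketch of the kind of check the "open" step could apply before loading a component. It is an illustration only: component_requested() is a hypothetical name, not the actual Open MPI code path, and it handles only a plain comma-separated include list (no exclusion syntax).

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper (not an Open MPI function).
 * Return true if the component called `name` should be opened, given
 * the value the user supplied for the framework's MCA parameter.
 * `requested` is a comma-separated list such as "linear" or
 * "linear,direct". NULL or "" means no restriction, so every
 * component is eligible. */
bool component_requested(const char *requested, const char *name)
{
    assert(name != NULL);

    if (requested == NULL || requested[0] == '\0') {
        return true;
    }

    size_t name_len = strlen(name);
    const char *p = requested;

    while (*p != '\0') {
        const char *comma = strchr(p, ',');
        size_t tok_len = (comma != NULL) ? (size_t)(comma - p) : strlen(p);

        /* Compare whole tokens only, so "line" does not match "linear". */
        if (tok_len == name_len && strncmp(p, name, name_len) == 0) {
            return true;
        }
        if (comma == NULL) {
            break;
        }
        p = comma + 1;
    }
    return false;
}
```

With "-mca routed linear", a loop over the routed framework's components that skips any name for which this returns false would load and open only linear, leaving direct and binomial untouched.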

Ralph

_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel



--
Jeff Squyres
Cisco Systems
